Commit graph

6285 Commits

Author SHA1 Message Date
Jeff Squyres
e6a3a406e2 Remove debugging printf
This commit was SVN r8139.
2005-11-13 14:57:44 +00:00
Jeff Squyres
425d255c05 Add documentation about what is happening in this class.
This commit was SVN r8138.
2005-11-13 12:56:38 +00:00
Jeff Squyres
4a208939f3 Don't run ompi_fifo and ompi_circular_buffer tests; the interfaces
have changed and the tests have not changed with them.

This commit was SVN r8137.
2005-11-13 11:33:23 +00:00
Jeff Squyres
7643b7b459 More changes for correctness of the sm btl.
- remove dead code that isn't used anywhere (originally ompi_fifo_t
  was going to be a generalized class, but now it's exclusively used
  in the sm stuff, so there's no point in the generalized code that
  definitely *won't* work with the sm btl, or is not being used now
  [SVN always has history so we can go back])
- had to add an interprocess lock in the area where the writer may
  create a new circular buffer to ensure that the reader's tail
  doesn't accidentally end up back in the same old buffer while the
  head continues on to a new circular buffer (this was what was
  happening to cause some intel tests to hang -- e.g., MPI_Scan_c,
  MPI_Send_fairness_c and MPI_Isend_fairness_c).  Unbelievably, this
  may actually *increase* performance because it may order things
  better.  Will do performance testing tomorrow.  We're fairly certain
  that this lock can probably be removed and the code fixed in a
  different way, but we're under a deadline and correctness comes
  first, so it's been added to the to-do list to come back and
  re-examine this case later.

This commit was SVN r8136.
2005-11-13 05:00:22 +00:00
Jeff Squyres
97b97f84b8 Next checkpoint in the sm btl fixes:
- Add big comment about a general overview of what the sm btl is doing
- random small code cleanups
- fix instances of mca_btl_sm[0] to mca_btl_sm[1] where relevant
- remove a lot of unused, confusing, and incorrect interface functions
  from ompi_fifo.h and ompi_circular_buffer.h.  These functions, if
  they were used, would not work properly with the scheme that the sm
  btl uses with the fifos (i.e., receiver makes right -- if necessary)
- add some missing offset computations in the fifo and circular buffers
- change the types of offsets to be ssize_t, not size_t
- remove an offset parameter from a function that didn't need it

This commit was SVN r8135.
2005-11-12 22:32:09 +00:00
Jeff Squyres
6444887373 - Add copyright headers to btl_sm_frag.h
- Ensure to convert base_shared_mem_flags to be a relative offset in
  the global storage, and then to convert that back to an absolute
  virtual address before we try to use it
- Don't double increment n_local_procs when calculating the peer rank
  during bootstrapping of the different base address case

Something else is still wrong; if mmap() returns a different base
address, things don't work (i.e., segv or hang forever when you try to
send a message).  More specifically, the bootstrapping now seems to
correctly handle the case when mmap() base addresses are different,
but the message passing does *not* -- it always assumes that the
mmap() base addresses are the same.

Still working on the fix for that -- want to checkpoint what has been
done so far to facilitate working on different machines...

This commit was SVN r8134.
2005-11-12 14:04:46 +00:00
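The relative-offset conversion described in the commit above can be sketched as follows. This is a minimal illustration, not the sm btl code: `to_offset`, `to_address`, and `demo_offsets` are hypothetical names, and two local arrays stand in for the same shared segment mapped at different base addresses in two processes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: turn an absolute address inside a mapped segment
 * into an offset relative to that segment's base address. */
ptrdiff_t to_offset(const char *base, const char *addr)
{
    return addr - base;
}

/* Hypothetical helper: turn a stored offset back into an absolute address,
 * using the base address of the mapping in the *current* process. */
char *to_address(char *base, ptrdiff_t offset)
{
    return base + offset;
}

/* Two arrays stand in for one shared segment seen at two different
 * base addresses (an assumption made only for this sketch). */
int demo_offsets(void)
{
    char segment_a[64];
    char segment_b[64];

    char *flags_in_a = segment_a + 16;                 /* absolute address in "process A" */
    ptrdiff_t off = to_offset(segment_a, flags_in_a);  /* value safe to store in shared memory */
    char *flags_in_b = to_address(segment_b, off);     /* absolute address in "process B" */

    assert(flags_in_b == segment_b + 16);
    return (int) off;
}
```

Storing only the offset in the shared region means each process can rebuild a valid pointer from its own base address, which is the property the message-passing path still lacked according to the commit message.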
George Bosilca
d8d13f879f When --disable-debug is specified we have to explicitly include the opal/util/output.h header.
This commit was SVN r8133.
2005-11-12 04:03:19 +00:00
George Bosilca
932c67aeb3 MPI_COMM_WORLD should be the first communicator that gets created, even before MPI_COMM_SELF and MPI_COMM_NULL.
This commit was SVN r8132.
2005-11-12 03:47:17 +00:00
Brian Barrett
1066518f3b Fix output of configure --help for the --with-threads option to be posix
instead of pthread.  The code expects posix.

This commit was SVN r8130.
2005-11-12 03:10:52 +00:00
Galen Shipman
5a4b1ebdd4 in mca_btl_openib_endpoint_post_send: set opcode on work request before potentially inserting it on pending list.
This commit was SVN r8127.
2005-11-12 02:11:14 +00:00
George Bosilca
e297b58fbd Add more MCA arguments.
Make some of them system (not seen by the user) and read-only.
Small cleanups.

This commit was SVN r8126.
2005-11-12 00:31:59 +00:00
Galen Shipman
5cf2d8d40c default to first available IP address if no matching subnets found.
This commit was SVN r8125.
2005-11-12 00:31:34 +00:00
Jeff Squyres
24b9de292c Fix for [righteous] compiler warnings from xlf90 compiler on OSX
10.3.  Specifically define what the parameter type is, and mark its
intent. 

This commit was SVN r8124.
2005-11-11 23:18:59 +00:00
Tim Woodall
59d8c791d9 return fragments to free list
This commit was SVN r8121.
2005-11-11 17:48:56 +00:00
Tim Woodall
607f62accd - pass a flag to the peer indicating whether data is contiguous at the source
- only attempt to schedule rdma if contiguous at both src/dst
- need to review this for next release 

This commit was SVN r8119.
2005-11-11 15:33:25 +00:00
Jeff Squyres
5d4091d485 Squelch some harmless symbols that I'm getting in the nightly reports
This commit was SVN r8118.
2005-11-11 15:25:14 +00:00
George Bosilca
c802d54696 The return type is an int. Casting it to a size_t before checking whether it is greater than zero makes the condition true even for negative values.
This commit was SVN r8114.
2005-11-11 06:34:14 +00:00
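A small sketch of the pitfall fixed in the commit above. The function names are hypothetical and do not come from the Open MPI sources; the point is only that converting a negative `int` to `size_t` yields a large positive value, so a `> 0` test on the converted value no longer detects errors.

```c
#include <assert.h>
#include <stddef.h>

/* Buggy form: rc is converted to an unsigned type before the comparison,
 * so a negative error code becomes a huge positive number. */
int buggy_is_positive(int rc)
{
    return (size_t) rc > 0;
}

/* Fixed form: compare using the signed type that was actually returned. */
int fixed_is_positive(int rc)
{
    return rc > 0;
}

/* Self-check covering an error code, zero, and a real positive count. */
void demo_signed_check(void)
{
    assert(buggy_is_positive(-1) == 1);  /* the error is mistaken for success */
    assert(fixed_is_positive(-1) == 0);  /* the error is detected */
    assert(buggy_is_positive(0) == 0);   /* zero is the only value the buggy test rejects */
    assert(fixed_is_positive(4) == 1);
}
```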
Graham Fagg
877f7bbe6a File based dynamic up and tested...
Lots of misc fixes: printfs->opal_output, handles fanin/out correctly for forced ops
unused vars, correct calculations on meaning of 'msgsize' for decision functions
(varies depending on algorithm), etc

This commit was SVN r8113.
2005-11-11 04:49:29 +00:00
Brian Barrett
878676218e Rename opal/memory to opal/memoryhooks because XLC++ on Mac OS X is broken.
When compiling C++ code that includes something that looks for the C++
header file "memory" (stupid C++ headers not having .h extensions), it
goes through the header file search path, which includes $(topsrcdir)/opal,
so it finds the directory $(topsrcdir)/opal/memory/ and tries to load
that as the memory header file and all goes downhill.

This commit was SVN r8111.
2005-11-11 00:26:27 +00:00
Brian Barrett
660d2f61b6 Don't add external declarations for the PMPI_W{TICK,TIME} functions
if profiling isn't enabled.  It appears that some compilers (g95)
will try to resolve the symbols if they are prototyped.

This commit was SVN r8110.
2005-11-11 00:12:40 +00:00
Josh Hursey
5fa34df9ce Fix for orted / MPI_Abort problem reported from testers. They were seeing orteds
spinning in orte_iof_base_flush() when running 
  intel_tests/src/MPI_Errhandler_fatal_c

When we close an endpoint by taking it out of the event handler, we need to make
sure that it fits the criteria to pass through orte_iof_base_flush(), specifically
make sure we clean out the ep_frags list.
Note: This is more of a sanity check, since the endpoint should already be
      in this state at the point of closure.

Secondly in orte_iof_base_endpoint_read_handler(), if we determine that it is 
necessary to close the endpoint we have to "return" after doing so, otherwise
we add another frag to the endpoint which will cause it to hang in 
orte_iof_base_flush().

Bug go squish!

This commit was SVN r8109.
2005-11-11 00:09:07 +00:00
Jeff Squyres
bcd037315f Some Fortran compilers actually will return that a type exists even if
it doesn't support it -- the compiler will automatically convert the
unsupported type to a type that it *does* support.  For example, if
you try to use INTEGER*16 and the compiler doesn't support it, it may
well automatically convert it to INTEGER*8 for you (!).  So we have to
check the actual size of the type once we determine that the compiler
doesn't error if we try to use it (i.e., the compiler *might* support
that type).  If the size doesn't match the expected size, then the
compiler doesn't really support it.

The F77 configure code actually handled this properly.  The F90 code
did not quite do it right.  This patch brings the F90 code up to the
same structure as the F77 code, albeit not m4-ized properly.  I also
added a comment to config/f77_check.m4 that explains *why* we do this
extra size check (because no explanation was given).

The impetus for this was that xlf* on OS X 10.3 was not recognizing
that INTEGER*16 was not supported, and mpi-f90-interfaces.h was being
assembled incorrectly.  This patch fixes this problem.

There is still one more problem, but waiting for some help from Craig
R on that (function pointers in F90 declarations).

This commit was SVN r8107.
2005-11-10 23:35:36 +00:00
Tim Woodall
654ba6d262 srq cleanup
This commit was SVN r8106.
2005-11-10 23:29:54 +00:00
Tim Woodall
2013104d1a SRQ cleanup
This commit was SVN r8104.
2005-11-10 20:51:56 +00:00
Tim Woodall
4a06e8463c port of flow control from mvapi
This commit was SVN r8102.
2005-11-10 20:15:02 +00:00
Tim Woodall
7f20198d49 Filter the set of data returned to the daemons during
startup using the new get_conditional command to improve
scalability during launch

This commit was SVN r8097.
2005-11-10 16:44:51 +00:00
Jeff Squyres
bacfb4fa2b Remove the generated F90 interfaces for all the "2 buffer" MPI API
functions (e.g., MPI_REDUCE).  We don't generate the back-end
subroutines for them (because it makes an exponential number of
subroutines, and compilers literally will segv), so we shouldn't
generate the f90 interfaces for them, either.  This allows users' MPI
F90 apps to automatically fall through to the F77 bindings for these
functions.

This commit was SVN r8094.
2005-11-10 16:04:39 +00:00
Tim Woodall
985c2ca943 cleanup
This commit was SVN r8093.
2005-11-10 15:40:27 +00:00
Tim Woodall
d62ea1835d correct typo
This commit was SVN r8090.
2005-11-10 15:29:52 +00:00
Brian Barrett
86e2adc43a * it appears that including event.h without calling opal_init annoys XLC on
OS X (you get an undefined symbol opal_event_lock).  Since the code is
  all #if 0'ed out, #if 0 out the header for now as well.

  I believe console and openmpi are to be removed from OMPI before 1.0
  release, so this doesn't need to go to the 1.0 branch

This commit was SVN r8089.
2005-11-10 15:24:57 +00:00
Jeff Squyres
6e08072113 Fix for the interface name for MPI_File_write_ordered_begin -- the
name was changed to shorten it too early (and then not restored), so
the "interface" name was not output correctly into
mpi-f90-interfaces.h.  Change to make it like the other long functions
-- temporarily change it to a shorter name while outputting the
subroutines, and then revert it when outputting the end interface.

This commit was SVN r8086.
2005-11-10 14:10:20 +00:00
Tim Woodall
3556757726 init callback from proxy
This commit was SVN r8085.
2005-11-10 05:27:11 +00:00
Tim Woodall
0b0d7f56c1 added support for callback on receipt of I/O
This commit was SVN r8084.
2005-11-10 04:49:51 +00:00
Tim Woodall
3699c924bd callback for init prior to launch - allow app to hookup stdout/stderr
prior to launch

This commit was SVN r8083.
2005-11-10 04:47:41 +00:00
Brian Barrett
d542e14120 * for some reason, some versions of linux didn't like that call to mmap. The
test isn't really needed to make sure the malloc intercept code was working,
  so just revert to not checking for now.  
  
  This should go to the 1.0 branch, as what is there now is causing issues.

This commit was SVN r8081.
2005-11-10 04:13:56 +00:00
George Bosilca
405d9794f8 Somehow I missed removing one of the previous definitions for the unavailable data.
This commit was SVN r8080.
2005-11-10 02:59:20 +00:00
Brian Barrett
5bf0a7bc62 * allow for the fact that svnversion might fail
This commit was SVN r8077.
2005-11-10 02:00:38 +00:00
George Bosilca
3507d5e9cd opal/util/output.h is required for optimized builds.
This commit was SVN r8076.
2005-11-10 01:19:27 +00:00
George Bosilca
8119c970db Improve the connection algorithm for MX. There are 2 problems here:
- first, we set up the connections with all the peers at the beginning
- MX does not handle well the case where several peers make connections to the same
  destination simultaneously.

So I changed the order in which we connect. First we compute our rank in the array,
then in a round-robin fashion we set up connections starting with our left neighbor.

This commit was SVN r8075.
2005-11-10 01:15:49 +00:00
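One way to picture the round-robin ordering described in the commit above is the sketch below. It is an illustration under stated assumptions, not the MX btl code: `connection_order` is a hypothetical name, "left neighbor" is taken to mean rank - 1 with wrap-around, and the walk is assumed to continue leftward from there.

```c
#include <assert.h>

/* Hypothetical helper: fill order[0 .. nprocs-2] with the peer ranks to
 * connect to, starting with the left neighbor (my_rank - 1) and walking
 * leftward with wrap-around. The direction is an assumption. */
void connection_order(int my_rank, int nprocs, int *order)
{
    assert(nprocs > 0 && my_rank >= 0 && my_rank < nprocs);

    for (int step = 1; step < nprocs; ++step) {
        /* Adding nprocs keeps the left operand of % non-negative. */
        order[step - 1] = (my_rank - step + nprocs) % nprocs;
    }
}
```

With this ordering, at any given step each process targets a different destination, so no peer receives several simultaneous connection attempts as long as the processes advance at roughly the same pace.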
George Bosilca
dc1ad885d1 Move the output message outside the loop. We print an error message only once when we fail to
connect to a peer. Bonus, we print some additional information like its MAC Address or name
if it's on our tables.

This commit was SVN r8074.
2005-11-10 01:13:18 +00:00
Tim Woodall
4c7c277b0a improve the scalability of MPI_Waitall ...
note that any code that sets a request to a completed state must
now increment a counter for every completed request

This commit was SVN r8073.
2005-11-10 00:45:27 +00:00
Tim Woodall
62fd74140b decrease socket buffer sizes to same as ptl code
This commit was SVN r8072.
2005-11-10 00:40:55 +00:00
Tim Woodall
2f6d50e0c6 init rdma count
This commit was SVN r8071.
2005-11-10 00:04:25 +00:00
Tim Woodall
b5ed723ea4 - check for null return
- disable debug

This commit was SVN r8070.
2005-11-10 00:02:18 +00:00
George Bosilca
55051b81c4 Activate the protection against unavailable datatypes. They get a flag DT_FLAG_UNAVAILABLE. We now check this flag in all the send/recv operations via the macros in mpi/c/bindings.h.
This flag is inherited by all datatypes created with unavailable datatypes. Basically, we let the user create the wrong datatype but we don't let them use it for any pt2pt communications or any pack/unpack.

This commit was SVN r8069.
2005-11-09 23:43:41 +00:00
Brian Barrett
5e6ab09424 apparently there are some platforms that do not like having munmap(NULL, 0)
called.  Instead, map and then munmap something.

This commit was SVN r8067.
2005-11-09 23:42:49 +00:00
Tim Woodall
78c98386d7 should reset the count (for persistent requests)
This commit was SVN r8064.
2005-11-09 22:02:48 +00:00
Tim Woodall
58b46d2da0 return mpool resources when request completes rather than in free
This commit was SVN r8063.
2005-11-09 21:59:01 +00:00
Graham Fagg
6b99301893 extra verbose in debug mode to help occ
This commit was SVN r8061.
2005-11-09 21:01:35 +00:00
Edgar Gabriel
b3d3552900 Fix for a problem Brian pointed out with cartesian communicators: in
comm_fill_rest there is no need for calling ompi_set_group_rank, since
we know already the rank of the process in the new comm. In case the
process was not part of the new communicator (rank = MPI_UNDEFINED)
calling this function caused a segfault on some platforms.

This commit was SVN r8060.
2005-11-09 21:00:58 +00:00