7447 Commits

Author SHA1 Message Date
Josh Hursey
2f20a38c98 This is a fix for bug Ticket #27
We were stuck in an infinite loop inside the rmaps round_robin
component when the user specified a host, then over subscribed it.
Instead of returning an error, we looped forever.

For example:
 $ cat hostfile
  A slots=2 max-slots=2
  B slots=2 max-slots=2
 $ mpirun -np 3 --hostfile hostfile --host B
  <hang>

The loop would not terminate because both host A and B are in the 
'nodes' structure as they are both allocated to the job. However,
after allocating 2 slots to host B, we remove it from the node list
leaving us with a 'nodes' structure with just A in it. Since we can't
use host A, we keep looping here until we find a node that we can use.

This patch checks for the case where rmaps starts a second pass over
the list without having found a node during the first pass. In that
case there are no nodes left to use, so we have a resource allocation
error and should return it to the user.
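
A minimal sketch of the termination check described above, not the rmaps round_robin code itself. The `node_t` type, its fields, and `map_round_robin` are hypothetical names chosen for illustration. The only idea carried over from the message is that one full pass over the node list without finding a usable node is treated as a resource allocation error instead of triggering another pass.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node record; not the ORTE data structure. */
typedef struct {
    const char *name;
    int slots_free;   /* slots still available on this node */
    bool usable;      /* false if the user excluded the node, e.g. via --host */
} node_t;

/*
 * Map nprocs processes round-robin over the nodes.
 * Returns 0 on success, or -1 as soon as a full pass over the list
 * finds no usable node with a free slot.
 */
int map_round_robin(node_t *nodes, size_t nnodes, int nprocs)
{
    size_t cur = 0;
    int mapped = 0;

    while (mapped < nprocs) {
        bool found = false;

        /* One pass: visit every node once, starting at cur. */
        for (size_t i = 0; i < nnodes; i++) {
            node_t *n = &nodes[(cur + i) % nnodes];
            if (n->usable && n->slots_free > 0) {
                n->slots_free--;
                mapped++;
                cur = (cur + i + 1) % nnodes;
                found = true;
                break;
            }
        }

        /* A whole pass without a hit: report an error, do not loop again. */
        if (!found) {
            return -1;
        }
    }
    return 0;
}
```

With the hostfile from the example (A and B with two slots each, only B usable), asking for three processes returns -1 after B's two slots are consumed.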

This patch should be moved to all of the release branches

This commit was SVN r10131.
2006-05-31 03:42:01 +00:00
Brian Barrett
f4a7e9be78 * Fix shell quoting to be more consistent with what AC does
Submitted by: Ralf Wildenhues
Reviewed by: Brian Barrett

This commit was SVN r10130.
2006-05-31 03:40:26 +00:00
Brian Barrett
6026fc98f6 * Fix M4 quoting so that AC 2.60 won't complain
Submitted by: Ralf Wildenhues
Reviewed by: Brian Barrett

This commit was SVN r10129.
2006-05-31 03:39:18 +00:00
Brian Barrett
7e1befaab8 * Fix M4 quoting so that AC 2.60 won't complain
Submitted by: Ralf Wildenhues
Reviewed by: Brian Barrett

This commit was SVN r10128.
2006-05-31 03:37:31 +00:00
Brian Barrett
c723d196c5 Rather than using fragment size to determine fragment type, use an enum.
Do this rather than the my_list pointer because we need to do some
things that are somewhat special because we pre-pin eager fragments but
not send fragments.  Also makes a couple ideas I have slightly easier to
play around with.
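
As a rough illustration of the change, the sketch below tags each fragment with an explicit type and keys the pre-pinning decision on that tag. `frag_type_t`, `frag_t`, and `frag_is_prepinned` are hypothetical names, not the BTL's actual definitions.

```c
#include <stddef.h>

/* Hypothetical fragment kinds. */
typedef enum {
    FRAG_EAGER,
    FRAG_SEND
} frag_type_t;

typedef struct {
    frag_type_t type;   /* decides how the fragment is handled */
    size_t size;        /* payload size; no longer used to infer the type */
} frag_t;

/* Eager fragments are pre-pinned, send fragments are not. */
int frag_is_prepinned(const frag_t *frag)
{
    switch (frag->type) {
    case FRAG_EAGER:
        return 1;
    case FRAG_SEND:
        return 0;
    }
    return 0;
}
```

Two fragments of the same size can now be treated differently, which a size-based test cannot express.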

This commit was SVN r10127.
2006-05-31 03:34:32 +00:00
Brian Barrett
0b09ba928c Fix for bug #17. Solaris provides stubs (who knows why, but they do) for
the pthreads library that don't do anything but are there when no special
options are given.  Both the GNU compiler and the Sun compiler could
sometimes ignore the -K* options, causing badness when building with
posix threads.  Don't try those options ;).

We still try -pthread and -pthreads because the compilers *do* error
when they see those options and some versions of the GNU compiler do
understand those flags (and do all the right things in that case).

This commit was SVN r10126.
2006-05-31 00:23:49 +00:00
George Bosilca
abc580b2d5 Sven's patch to check the optimized description before freeing it. For predefined
datatypes the optimized description points to the default description, so special
care should be taken before freeing it.
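
A small sketch of the kind of check involved, assuming a datatype that owns two heap-allocated descriptions. `dt_t`, its fields, and `dt_release` are hypothetical and do not reproduce the datatype engine's structures.

```c
#include <stdlib.h>

/* Hypothetical datatype with a default and an optimized description. */
typedef struct {
    void *desc;       /* default description */
    void *opt_desc;   /* optimized description; may alias desc */
} dt_t;

/* Release both descriptions without freeing the same block twice. */
void dt_release(dt_t *dt)
{
    if (dt->opt_desc != dt->desc) {
        free(dt->opt_desc);   /* free(NULL) is a no-op */
    }
    free(dt->desc);
    dt->opt_desc = NULL;
    dt->desc = NULL;
}
```

Without the pointer comparison, a datatype whose `opt_desc` aliases `desc` would be freed twice.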

This commit was SVN r10119.
2006-05-30 16:36:06 +00:00
Jeff Squyres
5f356edb64 Bring over changes from the /tmp/fortran-stuff series:
- Make the F90 bindings compile and link properly with gfortran 4.0,
  4.1, Intel 9.0, PGI 6.1, Sun (don't know version offhand -- the most
  current as of this writing, I think), and NAG 5.2, although some
  have limitations (e.g., NAG can't seem to handle the medium and
  large sizes)
- Building the F90 "small" module size is now the default, even for
  developers
- Split up mpif.h into multiple files because parts of it were toxic
  to the F90 bindings
- Properly specify unsized/unshaped arrays to make the bindings work
  on all known compilers
- Make ompi_info show Fortran 90 bindings size
- XML somewhat lags the generated scripts as of this commit, but
  functionality was my main goal -- the XML can be updated later (if
  at all).

This commit was SVN r10118.
2006-05-30 14:37:41 +00:00
Jeff Squyres
3daed7aaa1 Change svn:ignore before a larger commit -- had some problems with svn
merging in new files that were ignored by a current svn:ignore set.

This commit was SVN r10117.
2006-05-30 14:33:14 +00:00
George Bosilca
aa1c1e70c6 Fix the datatype bug noticed by Rainer. Under some circumstances (and only for
predefined datatypes) the optimized description was set to NULL instead of
pointing to some valid description. Because an optimized version is not
possible for some data (no optimization brings any benefit), we have
to make sure this field (opt_desc) is always correctly initialized.

This commit was SVN r10112.
2006-05-27 06:21:27 +00:00
George Bosilca
9da7af4c96 Remove all warnings except the missing prototypes.
This commit was SVN r10108.
2006-05-26 20:53:35 +00:00
Jeff Squyres
0c1e9eaa2e Update svn:ignore
This commit was SVN r10102.
2006-05-26 19:46:36 +00:00
Jeff Squyres
b4f83aa471 Add bullet about MPI_LONG_LONG and MPI_LONG_LONG_INT.
This commit was SVN r10091.
2006-05-26 12:21:52 +00:00
Jeff Squyres
6125fc73d3 Sync with 1.0 NEWS.
This commit was SVN r10089.
2006-05-26 12:09:41 +00:00
Galen Shipman
2667c52a5d Track fragments by list, not by size..
-- reviewed by Brian, needs to hit all the branches.. 

This commit was SVN r10078.
2006-05-25 18:07:26 +00:00
Galen Shipman
38a0561d9b Allow maximum send size to be less than the eager limit.
Instead of figuring out which free list the fragment belongs to based on size
we simply store a pointer to the list which it belongs in the fragment.
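
The idea can be sketched as follows, with a hypothetical singly linked free list standing in for the real one. `free_list_t`, `frag_t`, `frag_alloc`, and `frag_return` are illustrative names only.

```c
#include <stddef.h>

typedef struct frag frag_t;

/* Hypothetical singly linked free list. */
typedef struct {
    frag_t *head;
} free_list_t;

struct frag {
    frag_t *next;
    free_list_t *my_list;   /* list this fragment was allocated from */
    size_t size;            /* payload size; not used to pick a list */
};

/* Take a fragment from a list and remember where it came from. */
frag_t *frag_alloc(free_list_t *list)
{
    frag_t *frag = list->head;
    if (frag != NULL) {
        list->head = frag->next;
        frag->next = NULL;
        frag->my_list = list;
    }
    return frag;
}

/* Return a fragment to its owning list without inspecting its size. */
void frag_return(frag_t *frag)
{
    frag->next = frag->my_list->head;
    frag->my_list->head = frag;
}
```

Because the return path never looks at `size`, the maximum send size and the eager limit can be ordered either way without a fragment landing on the wrong list.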

This was reviewed by Brian and should hit all the branches.

This commit was SVN r10072.
2006-05-25 16:57:14 +00:00
Andrew Friedley
fa9ec2afdf Add my sandia username for convenience
This commit was SVN r10071.
2006-05-25 15:49:11 +00:00
Andrew Friedley
8a3d0862ca I can commit! *happy dance*
Trying to remember what I did here.. eager/max messages should work now, no RDMA yet.  A number of other fixes and cleanups.

I do know of two problems:
 Bad stuff happens when flooded with send frags too quickly - the BTL doesn't handle flow control.
 Certain IBM tests turn up a length assertion in the datatype engine - needs more investigation.

This commit was SVN r10070.
2006-05-25 15:47:59 +00:00
Jeff Squyres
b4e1a61cc4 Add svn:ignore
This commit was SVN r10067.
2006-05-25 12:33:15 +00:00
Jeff Squyres
8c1b89d98a Slight re-wording of this bullet
This commit was SVN r10064.
2006-05-25 12:29:17 +00:00
Jeff Squyres
3f93d32dfa - Add note about sm async progress fix
- Amend note about ROMIO fixes to include mention of "run-time" fixes

This commit was SVN r10063.
2006-05-25 12:28:04 +00:00
Gleb Natapov
f590d8a190 fix eager RDMA on PPC64.
This commit was SVN r10059.
2006-05-25 11:05:12 +00:00
Jeff Squyres
dd44d36be0 Fix for ticket #25. Ensure that in the threaded case where we have
This commit was SVN r10043.
2006-05-24 16:15:07 +00:00
George Bosilca
1c55956db1 Extend Sven patch for pack/unpack.
This commit was SVN r10040.
2006-05-24 14:48:00 +00:00
Jeff Squyres
a553c3444a This has bugged me for a long time: make the "want libltdl" output
like the rest of the output (i.e., "yes" / "no" vs. "1" / "0").

This commit was SVN r10039.
2006-05-24 10:56:47 +00:00
Jeff Squyres
3c265958ba @#$%@#%#%
Fix one more typo that was missed last night.

This commit was SVN r10038.
2006-05-24 10:30:08 +00:00
Jeff Squyres
8c0ebb4897 Drat -- forgot the copyright.
This commit was SVN r10025.
2006-05-23 18:42:11 +00:00
Jeff Squyres
8ccafdb521 Updates NEWS for MPI_PROD fix
This commit was SVN r10024.
2006-05-23 18:28:41 +00:00
Jeff Squyres
dc9a16581e Unbelievable how this lived so long. Thanks to Bert Wesarg for
reporting this.

This commit was SVN r10023.
2006-05-23 18:00:44 +00:00
George Bosilca
e832aac7b1 This is always on the critical path so let's make it static inline.
This commit was SVN r10020.
2006-05-23 03:22:15 +00:00
George Bosilca
95d0395578 I'm skeptical about the ability of the compiler to correctly optimize the
loop local variables.

This commit was SVN r10019.
2006-05-23 03:21:15 +00:00
George Bosilca
085cac552f Don't let TCP create local connections; we have the self BTL for this purpose.
This commit was SVN r10018.
2006-05-23 03:06:32 +00:00
George Bosilca
837221831a Temporary solution for in-bound computation of the next BTL.
This commit was SVN r10016.
2006-05-22 23:28:40 +00:00
Rainer Keller
772bba620d - Allow --enable-mca-direct for VPATH builds.
This commit was SVN r10007.
2006-05-22 14:24:30 +00:00
Rainer Keller
7cece521c6 - Use calloc as per suggestion of George.
This commit was SVN r10006.
2006-05-22 14:18:44 +00:00
George Bosilca
1dcd70ad80 The master convertor is the one that knows if the peers are
homogeneous or heterogeneous.

This commit was SVN r10005.
2006-05-22 06:22:32 +00:00
George Bosilca
6df7bf1a0f Remove one useless test.
This commit was SVN r10004.
2006-05-22 06:13:49 +00:00
George Bosilca
eb149cb9c8 Move the datatype tests into their own directory.
This commit was SVN r10003.
2006-05-22 06:12:43 +00:00
George Bosilca
b8ef0cc749 Minor cleanups.
This commit was SVN r10001.
2006-05-21 05:55:21 +00:00
George Bosilca
e43fbd0082 Remove all useless variables. Minor cleanups.
This commit was SVN r10000.
2006-05-21 05:53:22 +00:00
Galen Shipman
9165882c07 fixes for failover...
This commit was SVN r9998.
2006-05-20 02:39:05 +00:00
Jeff Squyres
faf63c68f8 Merge over from the /tmp/fortran-stuff branch
- split mpif.h into mpif.h and mpif-common.h[.in]
- mpif-common.h is included by various f90 things and contains output
  from configure
- mpif.h defines some f77-specific stuff and then includes
  mpif-common.h 

This commit was SVN r9997.
2006-05-20 02:15:49 +00:00
George Bosilca
1fbccda986 -g3 is definitely not a standard gcc option, at least not on anything
other than a quite recent version. Using this option prevents gdb from accessing
the contents of some of the structures. The error message is:
Unexpected type (0) encountered for integer constant.

This commit was SVN r9994.
2006-05-19 22:09:29 +00:00
Brian Barrett
96bf81a329 * datatype_check might need to update the value of count (if we received
  less than we posted for).  We were passing by value, so this update was
  not being propagated back up the stack and we could segfault.  Make the
  count argument a pointer so that updates will be passed as expected.
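
The difference between the two calling conventions can be shown in a few lines. The function names below are hypothetical and the real datatype_check does more than clamp a count.

```c
#include <stddef.h>

/* Buggy shape: the clamp only changes the local copy of count. */
void clamp_count_by_value(size_t count, size_t received)
{
    if (received < count) {
        count = received;   /* lost when the function returns */
    }
    (void)count;
}

/* Fixed shape: the caller's count is updated through the pointer. */
void clamp_count(size_t *count, size_t received)
{
    if (received < *count) {
        *count = received;
    }
}
```

A caller that posted 10 elements and received 4 still sees 10 after the by-value version, and sees 4 after the pointer version.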

This needs to go to the v1.1 branch

This commit was SVN r9991.
2006-05-19 21:58:12 +00:00
Jeff Squyres
58c8478bcd Add note about Torque 2.1.0p0
This commit was SVN r9977.
2006-05-18 21:25:56 +00:00
Jeff Squyres
8e26fd653d Torque changed the name of their library from libpbs to libtorque; this
commit updates our configure test to check for both names.

This commit was SVN r9976.
2006-05-18 21:03:38 +00:00
Jeff Squyres
299f4fdb2c Oops -- fix the comment. A victim of cut-n-paste.
This commit was SVN r9971.
2006-05-18 18:10:12 +00:00
Jeff Squyres
942f9e8f8d Fixes for ticket:14. Lengthy discussion is on that ticket and in a
comment in ompi_comm_invalid() in
source:/trunk/ompi/communicator/communicator.h.

Short version:
- ompi_comm_invalid() returns TRUE for MPI_COMM_NULL
- therefore MPI_COMM_C2F needs to explicitly check for MPI_COMM_NULL
  (because it uses ompi_comm_invalid())
- make ~20 MPI functions only call ompi_comm_invalid() instead of
  calling ompi_comm_invalid() *and* checking for MPI_COMM_NULL (~40 MPI
  functions already only called ompi_comm_invalid() -- we should be
  consistent)
- similar issue for ompi_win_invalid(), so I added a cross-referencing
  comment in win.h and fixed MPI_WIN_SET_NAME to only call
  ompi_win_invalid() (and not check for MPI_WIN_NULL)
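
The interaction between the validity check and the C-to-Fortran conversion can be sketched like this. Everything here is a stand-in: `comm_t`, `comm_null`, `comm_invalid`, and `comm_c2f` are hypothetical and only mimic the behavior described in the list above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical communicator record; not the ompi_communicator_t layout. */
typedef struct {
    int f_index;   /* Fortran handle value */
    bool freed;
} comm_t;

/* Stand-in for MPI_COMM_NULL; the index value is arbitrary here. */
static comm_t comm_null = { 0, false };

/* Mirrors the described behavior: the null communicator counts as invalid. */
bool comm_invalid(const comm_t *comm)
{
    return comm == NULL || comm == &comm_null || comm->freed;
}

/*
 * Conversion must accept the null communicator, so it tests for it
 * explicitly before relying on comm_invalid().
 */
int comm_c2f(const comm_t *comm)
{
    if (comm == &comm_null) {
        return comm_null.f_index;
    }
    if (comm_invalid(comm)) {
        return -1;
    }
    return comm->f_index;
}
```

Functions that should reject the null communicator need only the single `comm_invalid()` call, which is the consistency the commit aims for.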

This commit was SVN r9970.
2006-05-18 18:05:46 +00:00
George Bosilca
372ae03535 There should be one gap between the constructors and the destructors, otherwise
the last constructor will be set to NULL overwriting the first destructor. This
prevents us from calling the destructors on some classes.
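
A sketch of the layout issue, assuming both NULL-terminated arrays live in one allocation. `hook_t`, `hook_table_t`, `hook_table_init`, and `noop_hook` are hypothetical names, not the class system's own.

```c
#include <stdlib.h>

typedef void (*hook_t)(void *);

/* Example hook used only for illustration. */
void noop_hook(void *obj)
{
    (void)obj;
}

typedef struct {
    hook_t *ctors;   /* NULL-terminated, start of the allocation */
    hook_t *dtors;   /* NULL-terminated, later in the same allocation */
} hook_table_t;

/* Copy n constructors and n destructors into one block. */
int hook_table_init(hook_table_t *t, const hook_t *ctors,
                    const hook_t *dtors, size_t n)
{
    /* n entries plus one terminator for each array. */
    hook_t *block = malloc(2 * (n + 1) * sizeof(hook_t));
    if (block == NULL) {
        return -1;
    }

    t->ctors = block;
    /* Skip the constructors' terminator slot. Starting at block + n
     * instead would let the terminator below clobber dtors[0]. */
    t->dtors = block + n + 1;

    for (size_t i = 0; i < n; i++) {
        t->ctors[i] = ctors[i];
        t->dtors[i] = dtors[i];
    }
    t->ctors[n] = NULL;
    t->dtors[n] = NULL;
    return 0;
}
```

The caller releases the table with a single `free(t.ctors)`, since both arrays share the allocation.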

This commit was SVN r9969.
2006-05-18 16:21:29 +00:00
Brian Barrett
7000cecf78 Fix for standard output / standard error truncation issue when in a shell
pipeline.  See lengthy comment in iof_base_endpoint.c for the details, but
the short version is that we shouldn't set O_NONBLOCK on standard I/O 
file descriptors, so we no longer do.
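
A hedged sketch of the resulting policy using the POSIX `fcntl` interface. The helper name `set_nonblocking_unless_stdio` is hypothetical, and the detailed reasoning stays in the iof_base_endpoint.c comment, which is not reproduced here. One relevant POSIX fact is that O_NONBLOCK is a file status flag stored on the open file description, so other processes in a pipeline that share that description see the change too.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Put fd into non-blocking mode unless it is one of the standard I/O
 * descriptors. Returns 0 on success or when the descriptor is skipped,
 * and -1 if fcntl fails.
 */
int set_nonblocking_unless_stdio(int fd)
{
    if (fd == STDIN_FILENO || fd == STDOUT_FILENO || fd == STDERR_FILENO) {
        return 0;   /* leave standard I/O descriptors untouched */
    }

    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) {
        return -1;
    }
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        return -1;
    }
    return 0;
}
```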

Closes ticket:9

This commit was SVN r9966.
2006-05-18 15:43:32 +00:00