Commit graph

10441 commits

Author SHA1 Message Date
Gleb Natapov
097b17d30e Prevent a receive request from being freed while another thread holds a reference
to it or there is an outstanding completion for the request.

This commit was SVN r16153.
2007-09-18 16:18:47 +00:00
Tim Mattox
164a577908 Add another entry to the Open MPI 1.2.4 NEWS.
This commit was SVN r16150.
2007-09-18 15:47:46 +00:00
Aurelien Bouteiller
f762850242 Split run_global into process_project and process_framework. This allows for calling only process_framework to create components' internal sub-frameworks.
Minor change to ompi_mca.m4 to move AC_CONFIG_FILES(framework/makefile) into the autogen process (instead of the configure process), where we still know the actual framework path (instead of guessing using $project/mca/$framework).

This has shown no side effects in our testing. Let us know if this breaks one of your components in some exotic context.

This commit was SVN r16146.
2007-09-18 10:36:08 +00:00
Jeff Squyres
f9b9beba77 Allow the LSF components to be shipped in the nightly tarball and open
it up to others.

This commit was SVN r16143.
2007-09-17 22:42:33 +00:00
Jeff Squyres
33955a0ed0 Oops -- when converted from uint to int, -1 (the default value,
meaning "infinite") is no longer larger than the minimum required
size.  So put in an appropriate test to ensure that "infinite" was not
requested. 

This commit was SVN r16142.
2007-09-17 19:28:21 +00:00
Jeff Squyres
130a272cec Fix some compiler warnings about signed/unsigned comparisons.
This commit was SVN r16139.
2007-09-17 13:08:45 +00:00
Shiqing Fan
d4a7fb1378 - A small fix of format.
This commit was SVN r16138.
2007-09-17 12:10:04 +00:00
Josh Hursey
d2ef0d445a Add some basic timing hooks so I can extract a few more detailed performance
numbers for tuning.

Switch the bookmark_recv to be non-blocking. When it was blocking, slight
process delays at process counts >= 32 were causing cascading performance
delays in the protocol. This led to checkpoints taking either about 3 sec or
45 sec (or more) for 64 procs due to the cascading delays. With the nonblocking
receive version this is no longer the case; we get the speedup we expect for this
part of the protocol.

More tuning to come.

This commit was SVN r16137.
2007-09-16 15:13:23 +00:00
Tim Prins
a194896ae8 Reverts r16130.
There is a reason that we use the internal type (ompi_file_errhandler_fn) instead of the MPI typedef. When building without MPI-IO support (--disable-mpi-io), the MPI type is not defined, but the internal type IS defined in order to try to keep binary compatibility for apps that don't use MPI-IO.

This commit was SVN r16136.

The following SVN revision numbers were found above:
  r16130 --> open-mpi/ompi@cf5a38af5e
2007-09-15 11:19:13 +00:00
George Bosilca
02d8e721be Include all new files.
This commit was SVN r16134.
2007-09-14 23:16:12 +00:00
Jeff Squyres
6004e177e0 Fixes trac:1133: if you specify a max freelist size that is too small,
you'll get a helpful error message and the openib BTL will deactivate
itself.

This commit was SVN r16133.

The following Trac tickets were found above:
  Ticket 1133 --> https://svn.open-mpi.org/trac/ompi/ticket/1133
2007-09-14 21:42:56 +00:00
George Bosilca
d32a54d74e There is no values[1] ... How did the compilers get away with this?!
This commit was SVN r16132.
2007-09-14 21:33:25 +00:00
George Bosilca
71393fdfd9 Script for generating a Windows specific patch.
This commit was SVN r16131.
2007-09-14 21:25:56 +00:00
George Bosilca
cf5a38af5e There is no reason to use the internal type (ompi_file_errhandler_fn)
while everywhere else we're using the MPI typedef (MPI_File_errhandler_fn).

This commit was SVN r16130.
2007-09-14 21:23:39 +00:00
George Bosilca
6897926dce Not used anymore.
This commit was SVN r16129.
2007-09-14 21:20:19 +00:00
George Bosilca
fa40fd61f8 Update the Windows-related project and header files.
This commit was SVN r16128.
2007-09-14 21:18:52 +00:00
George Bosilca
d1364c53de Don't allocate the temporary buffer on the stack. It takes way too much
space.

This commit was SVN r16127.
2007-09-14 02:09:38 +00:00
George Bosilca
2c8c75ef94 Coverity blame list:
- Remove memory leaks
- Uninitialized return

This commit was SVN r16126.
2007-09-14 02:08:37 +00:00
George Bosilca
921d79c2b8 Remove a few memory leaks. Close the files when we're done with them.
This commit was SVN r16125.
2007-09-14 02:06:26 +00:00
George Bosilca
41ed50f901 Use the secure versions of strncpy and strncat. Release the temporary
resources on error.

This commit was SVN r16124.
2007-09-14 02:04:34 +00:00
George Bosilca
61989cc4d4 Don't hardcode the length, there is an argument for that. Don't
do the NULL check as we already know that tmp cannot be NULL.

This commit was SVN r16123.
2007-09-14 02:02:03 +00:00
George Bosilca
4e66376e66 Fix memory leak (Coverity 702).
This commit was SVN r16122.
2007-09-13 20:11:38 +00:00
Ralph Castain
45986ad2aa Add support to signal application procs for LSF
This commit was SVN r16120.
2007-09-13 18:09:14 +00:00
Tim Prins
4033a40e4e Coding standards...
This commit was SVN r16118.
2007-09-13 14:00:59 +00:00
George Bosilca
617ff3a413 Add a MCA parameter for the ELAN MAP ID file.
Fix small memory bugs, and track the final segfault. Still some work to do.

This commit was SVN r16117.
2007-09-12 21:25:35 +00:00
Aurelien Bouteiller
a1f5312afb Fixed two little warnings
This commit was SVN r16116.
2007-09-12 21:07:11 +00:00
Ralph Castain
9fa254c017 Provide a better error message when a daemon unexpectedly dies under SLURM so we differentiate between failing to start and aborting while the app is running.
This commit was SVN r16115.
2007-09-12 20:53:50 +00:00
Aurelien Bouteiller
ccb3f75e8f Make sure that the pml v parasite never gets loaded when the user did not request FT. This does not break the ability to switch protocol on the fly.
This commit was SVN r16114.
2007-09-12 20:47:17 +00:00
George Bosilca
1e7a791349 Remove some of the problems identified by Coverity.
This commit was SVN r16112.
2007-09-12 20:13:26 +00:00
Rolf vandeVaart
a289ac114a 1. Remove some #ifdef 0 code.
2. Remove some unnecessary code that was causing a SEGV. 
There may be some more work to be done, but at least orte-clean is functional again. 

This commit was SVN r16111.
2007-09-12 19:50:58 +00:00
Aurelien Bouteiller
828af95be8 Major modification of the vprotocol framework build system. With better integration in autogen.sh, it allows for generating static-components.h the usual way.
NOTE: This build system does not work with the current autogen.sh. A modified one is under heavy testing to make sure it does not have side effects.

This commit was SVN r16110.
2007-09-12 18:46:37 +00:00
Josh Hursey
b4c68c0925 Turn back on the absolute path protection for the moment.
It is masking a bug that I'm tracking down in the SNAPC FULL - FILEM interactions.

Also make sure to clean out the filem structure before asking for another
checkpoint file when not storing the files in place.

This commit was SVN r16109.
2007-09-12 18:19:39 +00:00
George Bosilca
e5d316dba6 Coverity: fix issues with using a string once it gets freed. The problem is that
mca_base_register_string doesn't set the result to NULL if an error occurs.

This commit was SVN r16108.
2007-09-12 18:16:53 +00:00
George Bosilca
7b3dcff267 Coverity: Limit the strcpy to the maximum length of the destination.
This commit was SVN r16107.
2007-09-12 18:03:53 +00:00
George Bosilca
bfb4ddc3e2 Coverity: remove dead code.
This commit was SVN r16106.
2007-09-12 17:56:33 +00:00
George Bosilca
05ae27c68b Don't segfault if we receive a fragment for a non-existing communicator.
Instead, drop it for now.

This commit was SVN r16105.
2007-09-12 17:52:02 +00:00
George Bosilca
c755938eb0 Coverity: release the temporary buffer on error.
This commit was SVN r16104.
2007-09-12 17:45:12 +00:00
George Bosilca
2b7ed6262b Update the communicator lowest_free when we rebuild the communicator list.
This commit was SVN r16102.
2007-09-12 16:41:14 +00:00
Shiqing Fan
b1ea3e0054 - add more lines for static import declaration on windows.
This commit was SVN r16101.
2007-09-12 15:32:54 +00:00
Shiqing Fan
a0660f4deb - Just some type casts.
This commit was SVN r16100.
2007-09-12 15:29:58 +00:00
Josh Hursey
b4735c9719 Remove an old workaround in which we had to 'mv' the checkpoint file after it
was taken from the $CWD to the storage directory. Now we just store directly
to the storage directory, which can reduce NFS traffic if working in that mode.

A slight performance boost, but at the point you are using NFS you are paying
a penalty anyway. Now you just don't have to pay it twice :)

This commit was SVN r16099.
2007-09-12 15:03:21 +00:00
Ralph Castain
f80ea093a2 Ensure that the orteds do not directly respond to USR1/2 signals. Those signals are trapped by mpirun and propagated from there - at most, the orteds are involved in the propagation process, but should never do anything on their own.
This commit was SVN r16098.
2007-09-12 14:32:31 +00:00
Gleb Natapov
07c8fddeef Fix scheduling of pending send request. It should be scheduled req_lock times.
This commit was SVN r16096.
2007-09-12 07:08:38 +00:00
George Bosilca
d8fed2cfa1 Set a default value so that some compilers stop complaining about
uninitialized values.

This commit was SVN r16094.
2007-09-11 18:00:53 +00:00
George Bosilca
2e46809995 Only release the comm_reg if we have one.
This commit was SVN r16093.
2007-09-11 17:59:40 +00:00
Shiqing Fan
548a4fe943 - Use IOVBASE_TYPE instead of char to avoid warnings on some systems.
This commit was SVN r16092.
2007-09-11 16:24:23 +00:00
Gleb Natapov
140dce7614 Fix the ABA problem in the atomic_lifo code. This is a temporary solution for now. We
are looking for a better one.

This commit was SVN r16091.
2007-09-11 15:40:30 +00:00
Gleb Natapov
e82a6eec27 Restore the check for lowest id. It prevents a livelock situation if multiple threads
are inside the function and they failed to obtain a new cid the first time around.

This commit was SVN r16090.
2007-09-11 15:32:46 +00:00
Gleb Natapov
58a018c16d The code tries to prevent itself from running for more than one communicator
simultaneously, but is doing it incorrectly. If the function is already running
for one communicator and it is called from another thread for another communicator
with a lower cid, the check comm->c_contextid != ompi_comm_lowest_cid()
will fail and the function will be executed for two different communicators by
two threads simultaneously. There is nothing in the algorithm that prevents it
from running simultaneously for different communicators as far as I can see,
but ompi_comm_unregister_cid() assumes that it is always called for a communicator
with the lowest cid and this is not always the case. This patch removes the bogus
lowest cid check and fixes ompi_comm_register_cid() to properly remove the cid from
the list.

This commit was SVN r16088.
2007-09-11 13:23:46 +00:00
Shiqing Fan
c1065d8262 - Some more type casts.
This commit was SVN r16087.
2007-09-11 11:28:43 +00:00