4203 Commits

Author SHA1 Message Date
Ralph Castain
df7ef31b61 Plug a memory leak that crept into the registry.
Modify the locking scheme to try and resolve a problem with dump_triggers that only occurs with multiple processes. Didn't resolve the problem, but should be more robust anyway. Still tracking this one down.

This commit was SVN r5114.
2005-03-31 21:32:12 +00:00
Tim Woodall
74cc7ba381 on abnormal termination - dump info on the aborted process
This commit was SVN r5112.
2005-03-31 19:39:02 +00:00
Tim Woodall
81e9377c87 use nonblocking send/recv to send terminate message to the daemons
as they may have already exited

This commit was SVN r5111.
2005-03-31 18:53:35 +00:00
Tim Woodall
447d370905 - added proxy resource manager which is loaded when not the seed
- added support to pls fork/rsh modules for terminate_job

This commit was SVN r5110.
2005-03-31 15:47:37 +00:00
Ralph Castain
8686c60223 Fix the subscription system so it correctly deals with triggers at a level (as opposed to comparing two counters). Combined with Brian's latest checkin, this corrects the tendency for orterun to "hang" when one or more processes abnormally terminate.
This commit was SVN r5109.
2005-03-31 14:24:36 +00:00
Brian Barrett
5753c6a47f Only set the state of the processes the daemon was responsible for to
ABORTED if the ssh that started the daemon exited abnormally.  Otherwise,
bad things happen if all the processes on that node exit before the
processes on other nodes.

This patch is bigger than it should be because I had to indent a bunch of code 
when I moved the if statement.

This commit was SVN r5107.
2005-03-31 04:23:55 +00:00
Jeff Squyres
13a4aee1a5 - Add ignore file so that this component is not included in the
tarball
- Guess at usernames for unignore -- will send out mail shortly to
  verify

This commit was SVN r5106.
2005-03-31 03:24:16 +00:00
Jeff Squyres
f783edb272 Add ignore files so that these components are not included in the
tarball.

This commit was SVN r5105.
2005-03-31 03:23:22 +00:00
Brian Barrett
6c476d7822 * fix free() of NULL when no MPI applications are started
This commit was SVN r5104.
2005-03-30 18:03:08 +00:00
George Bosilca
c7fe83c845 The condition to be able to use the user buffer for the receive is to have a contiguous datatype AND to have
its extent equal to the size.

This commit was SVN r5103.
2005-03-30 15:03:46 +00:00
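The condition stated in r5103 can be sketched as a small predicate. This is only an illustration: `struct dt_desc` and `can_receive_in_place` are hypothetical names, not Open MPI's real datatype structure or API. The extent check matters because consecutive elements are spaced by the extent, so an extent larger than the size leaves gaps between elements even when each element is contiguous.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified datatype descriptor (an assumption for this
 * sketch; not the structure used in the Open MPI sources). */
struct dt_desc {
    bool   contiguous; /* true if one element has no internal gaps */
    size_t size;       /* bytes of actual data in one element */
    size_t extent;     /* distance between consecutive elements */
};

/* The user buffer can be used directly for the receive only when the
 * datatype is contiguous AND its extent equals its size. */
static bool can_receive_in_place(const struct dt_desc *dt)
{
    return dt->contiguous && dt->extent == dt->size;
}
```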
George Bosilca
9100aa945d Do not forget to set the upper bound for predefined datatypes.
This commit was SVN r5102.
2005-03-30 15:02:34 +00:00
Ralph Castain
e3471e1342 Update the unit test matrix
This commit was SVN r5101.
2005-03-30 14:16:02 +00:00
Jeff Squyres
8a63a3d3d8 Make the test to find sched_yield() better so that we don't always
link in -lrt on linux.

This commit was SVN r5100.
2005-03-30 01:43:16 +00:00
Brian Barrett
043e7682d2 * Make sure to free the callbacks array during finalize
* add MCA parameter (OMPI_MCA_mpi_yield_when_idle) to cause sched_yield()
  to be called when the progress engine is called and nothing happens.
  Default is to call sched_yield().
* add MCA parameter (OMPI_MCA_mpi_event_tick_rate) to adjust the rate
  at which the event library is called from ompi_progress.  When set
  to 0, the event library will never be ticked.  When set to 1, the
  event library will be progressed every time.  2 every other, etc.

The MCA parameters are only in effect from end of MPI_Init to start of
MPI_Finalize.

This commit was SVN r5099.
2005-03-30 01:40:26 +00:00
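The tick-rate semantics described in r5099 (0 never, 1 every time, 2 every other call) can be sketched with a simple counter. `should_tick` and its arguments are hypothetical names for this illustration; the commit does not show how ompi_progress implements the check, and which call of each pair ticks is an assumption here.

```c
#include <stdbool.h>

/* Hypothetical helper, not the Open MPI implementation: decides whether
 * the event library should be ticked on this call to the progress loop.
 * `counter` is caller-owned state and should start at 0. */
static bool should_tick(unsigned int tick_rate, unsigned int *counter)
{
    if (tick_rate == 0) {
        return false;      /* 0: the event library is never ticked */
    }
    *counter += 1;
    if (*counter >= tick_rate) {
        *counter = 0;      /* 1: every call, 2: every other call, ... */
        return true;
    }
    return false;
}
```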
Brian Barrett
e00782ed7d * add missing string.h include and copyright header
This commit was SVN r5098.
2005-03-30 01:31:30 +00:00
Jeff Squyres
5eb51821c0 Somehow missed editing these bindings in the big f77 commit the other
day (to convert "void*" -> real function pointer types, and "char*
extra_state" to "MPI_Fint* extra_state").

George still gets to cleanup/finish MPI_REGISTER_DATAREP().  :-)

This commit was SVN r5097.
2005-03-29 22:24:51 +00:00
Ralph Castain
35b4fc02e2 Clean up some of the util unit tests. Fix a detected problem in orte_os_path for relative path names.
This commit was SVN r5096.
2005-03-29 19:58:32 +00:00
Tim Woodall
72b3a823c3 need to use integer when passing jobid on command line
This commit was SVN r5095.
2005-03-29 19:41:29 +00:00
Tim Woodall
f4f24cba9f cleanup, correct acks when forwarding through an intermediary
This commit was SVN r5094.
2005-03-29 19:40:38 +00:00
Ralph Castain
5f2051caad Fix it so that failure of getpwent defaults to using a string version of the uid - so session directories still work on bizarre systems that don't push the passwd file to the backend nodes...
This commit was SVN r5093.
2005-03-29 19:13:28 +00:00
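The fallback described in r5093 can be sketched as follows. This is a minimal illustration, not the committed code: `user_name_or_uid` is a hypothetical helper, and it uses the POSIX getpwuid() lookup where the commit message mentions getpwent.

```c
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write a user name for `uid` into `buf`. If the
 * passwd lookup fails (for example, no passwd file on a backend node),
 * fall back to the uid printed as a decimal string. */
static void user_name_or_uid(uid_t uid, char *buf, size_t len)
{
    const struct passwd *pw = getpwuid(uid);

    if (pw != NULL && pw->pw_name != NULL) {
        snprintf(buf, len, "%s", pw->pw_name);
    } else {
        snprintf(buf, len, "%lu", (unsigned long) uid);
    }
}
```

Either way the caller gets a non-empty string that is unique per user, which is all a session directory name needs.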
Jeff Squyres
037e593f34 Fix one more minor problem with a possible overflow in error checking
This commit was SVN r5092.
2005-03-29 18:06:11 +00:00
George Bosilca
aaf1286847 Correct the local send/recv function. It's mostly used on the collective functions.
This commit was SVN r5091.
2005-03-29 17:50:38 +00:00
Tim Woodall
a9e962d0a4 for now (dump daemons' output to /dev/null) - need to fix this to use the
session directory

This commit was SVN r5090.
2005-03-29 15:47:24 +00:00
Brian Barrett
8f248206e3 * back out some things I was playing with to get better gm numbers. Sorry
about that :/

This commit was SVN r5086.
2005-03-29 13:56:52 +00:00
Brian Barrett
cdbf179d40 * add header files that "go missing" if compiling with optimizations
* Fix one file that didn't have a comment header

This commit was SVN r5085.
2005-03-29 13:50:15 +00:00
Jeff Squyres
1876ecbf23 Add missing <string.h> for memset
This commit was SVN r5084.
2005-03-29 13:25:53 +00:00
Jeff Squyres
4e198561f6 Fix INCFLAGS for VPATH builds.
This commit was SVN r5083.
2005-03-29 02:50:18 +00:00
Jeff Squyres
71d423529a - Removed some useless C++ protection
- Added some protection to portions that should only be used when
  we're building OMPI (not, for example, when mpicc is being used to
  compile a user's MPI application)

This commit was SVN r5082.
2005-03-29 02:48:50 +00:00
Jeff Squyres
b3a75f27f6 Fix some typos in processing configure options
This commit was SVN r5081.
2005-03-29 02:47:43 +00:00
Ralph Castain
dfe49d0fd2 Fix a subtle bug in the registry callback system that was manifesting itself in the singleton case and (randomly) in the multiprocess case.
Update the unit-test-status matrix to include priority.

Add several new registry diagnostics that helped track down the above bug.
M    test/mca/gpr/gpr_triggers.c
M    test/Unit-Test-Status.xls
M    test/Unit-Test-Status.pdf
M    src/mpi/runtime/ompi_mpi_init.c
M    src/mca/oob/base/oob_base_xcast.c
M    src/mca/ns/base/ns_base_nds_env.c
M    src/mca/gpr/replica/api_layer/gpr_replica_dump_api.c
M    src/mca/gpr/replica/api_layer/gpr_replica_api.h
M    src/mca/gpr/replica/communications/gpr_replica_comm.h
M    src/mca/gpr/replica/communications/gpr_replica_remote_msg.c
M    src/mca/gpr/replica/communications/gpr_replica_cmd_processor.c
M    src/mca/gpr/replica/communications/gpr_replica_dump_cm.c
M    src/mca/gpr/replica/gpr_replica_component.c
M    src/mca/gpr/replica/gpr_replica.h
M    src/mca/gpr/replica/functional_layer/gpr_replica_dump_fn.c
M    src/mca/gpr/replica/functional_layer/gpr_replica_fn.h
M    src/mca/gpr/replica/functional_layer/gpr_replica_trig_ops_fn.c
M    src/mca/gpr/replica/functional_layer/gpr_replica_messaging_fn.c
M    src/mca/gpr/replica/functional_layer/gpr_replica_segment_fn.c
M    src/mca/gpr/proxy/gpr_proxy_dump.c
M    src/mca/gpr/proxy/gpr_proxy.h
M    src/mca/gpr/proxy/gpr_proxy_component.c
M    src/mca/gpr/gpr_types.h
M    src/mca/gpr/base/base.h
M    src/mca/gpr/base/unpack_api_response/gpr_base_dump_notify.c
M    src/mca/gpr/base/pack_api_cmd/gpr_base_pack_dump.c
M    src/mca/gpr/gpr.h

This commit was SVN r5080.
2005-03-28 22:37:54 +00:00
George Bosilca
28c01fd07e Give a default value if the compiler reports uninitialized usage.
This commit was SVN r5079.
2005-03-28 21:25:37 +00:00
George Bosilca
5d41ece75e rc should have a default value.
This commit was SVN r5078.
2005-03-28 21:11:10 +00:00
George Bosilca
2e22edc6ef Remove an unused variable.
This commit was SVN r5077.
2005-03-28 21:04:26 +00:00
George Bosilca
ed059df050 Reenable sched_yield by default.
This commit was SVN r5076.
2005-03-28 21:03:24 +00:00
George Bosilca
66478adbf4 The lock declaration should be protected by OMPI_HAVE_THREAD_SUPPORT
otherwise some compilers complain.

This commit was SVN r5075.
2005-03-28 21:02:02 +00:00
George Bosilca
d338550a37 Include ompi_event_signal_pipe in the #if.
This commit was SVN r5074.
2005-03-28 21:01:20 +00:00
Jeff Squyres
ffc75a623f Remove redundant declaration of orte_soh and move it into
src/mca/soh/base/base.h (similar to most other frameworks).

This commit was SVN r5073.
2005-03-28 20:54:45 +00:00
Jeff Squyres
96a13fd818 Solaris needs -lrt for sched_yield()
This commit was SVN r5072.
2005-03-28 20:52:00 +00:00
Jeff Squyres
79f631b88a Fix a call to AC_REPLACE_FUNC that we previously missed
This commit was SVN r5071.
2005-03-28 20:51:42 +00:00
Jeff Squyres
9d7ed5f7c0 Fix borked prior commit:
- INADDR, not IFADDR.  Duh.
- Accidentally removed another comment; restored.

This commit was SVN r5070.
2005-03-28 20:25:39 +00:00
Brian Barrett
f264618a67 * add a util/output.h include to a bunch of files. There are a couple of
configure options that can combine to mean that the ompi_config_bottom.h
  doesn't include this...

This commit was SVN r5068.
2005-03-28 20:07:19 +00:00
Jeff Squyres
6c316d58dd Don't use non-portable ceilf() function -- use a simple integer
calculation to get the same result instead.

This commit was SVN r5066.
2005-03-28 18:39:02 +00:00
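One common integer-only replacement for a ceilf() of a quotient is shown below. The commit message in r5066 does not show the actual calculation, so `ceil_div` is a hypothetical helper that only illustrates the general technique.

```c
#include <assert.h>

/* Ceiling of num / den using only integer arithmetic.
 * The quotient-plus-remainder form avoids the overflow that
 * (num + den - 1) / den can hit when num is near the type's maximum. */
static unsigned long ceil_div(unsigned long num, unsigned long den)
{
    assert(den > 0);
    return num / den + (num % den != 0 ? 1UL : 0UL);
}
```

For example, `ceil_div(7, 2)` yields 4 and `ceil_div(8, 2)` also yields 4, matching what ceilf(7.0f / 2.0f) and ceilf(8.0f / 2.0f) would return.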
Brian Barrett
a763345491 * fix header declaration so it compiles again
This commit was SVN r5065.
2005-03-28 16:42:56 +00:00
Brian Barrett
a39eac0825 fix up a couple of things in the waitpid function (the patch is bigger than
it looks because I re-word wrapped a couple of long comments):

  - remove the polling of the event library in all cases where the
    condition variables already do so.  The condition variables were
    updated and I didn't update this code to match.  This was only
    causing problems because there were some cases where it was
    causing deadlock-like things with the orte_wait mutex.

  - unlock the orte_wait mutex once we have the status info from
    the pid we were waiting on, rather than holding it until the
    condition variable can be destroyed.  This allows us to poll
    a bit more blindly while waiting for the other thread to finish
    with the condition signal

  - do the condition variable-like unlock / poll / lock cycle
    when progressing the event library when the condition variable
    doesn't do it for us.

This commit was SVN r5064.
2005-03-28 15:57:11 +00:00
Jeff Squyres
6f0b7d3cac A simpler way to do the same thing.
This commit was SVN r5063.
2005-03-28 15:42:02 +00:00
Jeff Squyres
1826e3c445 Add missing <netdb.h>
This commit was SVN r5062.
2005-03-28 14:49:36 +00:00
Jeff Squyres
25c2da6f7a s/char */const char*/ as it's really more appropriate in some places,
and some compilers complain about it.

This commit was SVN r5061.
2005-03-28 14:49:06 +00:00
Jeff Squyres
e6bf5aa6db Some compilers actually complain about empty header files (actually,
about header files that have no trailing newline).

This commit was SVN r5060.
2005-03-28 14:26:49 +00:00
Jeff Squyres
bb986065fc More bad things that ROMIO's configure should not be doing.
This commit was SVN r5059.
2005-03-28 13:02:23 +00:00
Jeff Squyres
c71d0d8bee Add a check for IFADDR_NONE (Solaris doesn't have it; the man page for
inet_addr() says that it returns -1 upon failure).

This commit was SVN r5058.
2005-03-28 11:55:57 +00:00
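A portability guard of the kind r5058 describes might look like the sketch below. It uses the INADDR_NONE spelling that r5070 later corrects to; `parse_ipv4` is a hypothetical wrapper, and the fallback value follows the inet_addr() man page note quoted in the commit (-1 on failure).

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>

/* Some systems do not define INADDR_NONE; inet_addr() is documented
 * to return -1 on failure, so supply that value when it is missing. */
#ifndef INADDR_NONE
#define INADDR_NONE ((in_addr_t) -1)
#endif

/* Hypothetical wrapper: parse a dotted-quad string into network byte
 * order. Returns false on failure. Note that "255.255.255.255" is
 * indistinguishable from failure with inet_addr(). */
static bool parse_ipv4(const char *text, in_addr_t *out)
{
    in_addr_t addr = inet_addr(text);

    if (addr == INADDR_NONE) {
        return false;
    }
    *out = addr;
    return true;
}
```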