
12 Commits

Author SHA1 Message Date
Ralph Castain
3e19906b95 Begin to "leak" the changes to the registry and supporting subsystems to resolve the flood situation and support abnormal terminations. These changes just define a new message structure for returning all startup/shutdown information in a single broadcast-like transmission. Shouldn't have any impact on existing code as the message object isn't used yet.
This commit was SVN r3311.
2004-10-25 13:36:09 +00:00
Jeff Squyres
d324a7725c - Add #if protection around non-portable system .h files
- Add #include "ompi_config.h" to all .c files, and ensure that it's
  the first #included file
- remove a few useless #if HAVE_CONFIG_H checks

This commit was SVN r3229.
2004-10-20 01:03:09 +00:00
Ralph Castain
ad395fa825 First commit of the revised startup system.
Having noted the existence of the wondrous Open MPI "statistics" tracker, I feel compelled to commit these changes one file at a time. This will, of course, provide me with wonderful statistics for the number of commits I have done, thus ensuring that those who watch such things become truly impressed by the magnitude of my contribution.

Of course, I will also do a commit for each time I correct a typo in my own software, and each time I add a comment to a file - a comment that, ordinarily, one might expect to have already been in place before the first commit. But then....I wouldn't look as impressive if I did it that way! No, no...far better to add the comments - and do a commit after each one - separately!

So, enjoy all.
Ralph
aka. The longtime Don Quixote crusader against the asinine use of meaningless statistics in place of true performance metrics.

This commit was SVN r2824.
2004-09-23 14:32:31 +00:00
Brian Barrett
41e17e2758 * rename pack.{c,h} to bufpack.{c,h} because there was already a pack.c in
  src/mpi/c and you can't have two object files with the same name in
  the same library

This commit was SVN r2782.
2004-09-20 19:55:01 +00:00
George Bosilca
efc09dfc94 increase timeout
This commit was SVN r2778.
2004-09-20 17:29:29 +00:00
Ralph Castain
0d4e6482cd Continuing the cleanup process. Few minor fixes here and there - mostly just NULLing pointers that were free'd. Console now can connect to any universe, regardless of scope.
This commit was SVN r2734.
2004-09-17 00:59:14 +00:00
Ralph Castain
f6dc129754 Allow mpirun2 and mpi_init to cleanly detect and join an existing universe. Will continue testing to quickly move away from a non-responsive existing universe.
This commit was SVN r2729.
2004-09-16 19:45:32 +00:00
Ralph Castain
069682e046 A bunch of minor changes, mostly adding diagnostics. Just wanted to checkpoint so I can start fresh since there now seem to be problems in the tree with mpirun2.
Fixed ompid so it reissues the non-blocking receive - should now be close to ready for primetime. Fixed some logic in the svc framework that wasn't checking properly for action flags. 

This commit was SVN r2660.
2004-09-14 14:21:04 +00:00
Ralph Castain
a14ee7eb48 Checkpoint the console and daemon.
Folks - there appears to be something unreliable about communication with the daemon at the moment. We are trying to track it down. Meantime, please be patient if experimenting with it.

This commit was SVN r2633.
2004-09-13 16:51:53 +00:00
Ralph Castain
57ceb5225e Work around the mca_oob_ping problem by doing rapid multiple checks - works just fine.
We now have the ability to generate and join a persistent universe. You can create one in two ways:

(a) issue the "openmpi" command. This will fork/exec a seed daemon on your local host. You can specify a universe name or else it will just use the default.

(b) issue the "ompid -seed" command. Starts the seed up directly. Takes all the same options as openmpi.

I will be adjusting mpirun2 and mpi_init to allow connection to existing persistent universes, but they don't do it right now. The ompiconsole program simply issues an exit command to the persistent universe, so you can use it to shut the universe down if you like (a kill -9 works too).

This commit was SVN r2629.
2004-09-13 14:14:00 +00:00
Ralph Castain
55a2576f01 Update some basic functions, mostly with diagnostics.
This commit was SVN r2620.
2004-09-13 01:25:25 +00:00
Ralph Castain
c6cbe33d50 Some of these didn't really change - I was just in/out of them for diagnostics while chasing a bug. Got caught by my good buddy Tim again :) on his parse_contact_info function, which requires that the space for the answer be allocated in advance. Sigh. Anyway, mpirun2 now works again. My apologies if you tried it in the last few hours and found it didn't.
Also removed the mpirun3 directory since we are basically dragging mpirun2 along with us - no need to create a new version after all.

Made a few changes to the universe info structure, eliminating the "webserver" and "socket" fields since we will do those contacts through the oob channel. Also changed the "silent_mode" field to "console" since silent mode is the default - the flag needs to tell you to turn the console on, not off.

Parse environ function now gets the ns and gpr replica contact info and loads it in the proper places to hand it off to the respective components, thus allowing me to check connection to them as part of determining if the named universe already exists. Changed the local_universe_exists function accordingly and gave it a new name (since the replicas may not be local). This name will shortly be changed to "ompi_rte_join_universe" as I complete the logic for doing that function.

Please let me know if you see any problems. I successfully ran some trivial multi-process functions in both mpirun2 and singleton modes, and ran the seed daemon as well, so I think it should all be okay.

This commit was SVN r2611.
2004-09-11 12:56:52 +00:00