
57 Commits

Author SHA1 Message Date
Jeff Squyres
3f5541349a Add UC copyright
This commit was SVN r5009.
2005-03-24 12:43:37 +00:00
Ralph Castain
e2597395e8 Enable the compound command capability to reduce message loads during startup.
Add two files (one .xls and one .pdf) that track the status of unit test development. Comments/revisions welcomed.

This commit was SVN r4999.
2005-03-23 17:50:12 +00:00
Brian Barrett
6822a519bb * results from initial merge of the tim branch into the trunk. Compiles and
ompi_info works, but that's all that has been tested.

This commit was SVN r4827.
2005-03-14 20:57:21 +00:00
Prabhanjan Kambadur
cc3a8ec302 Making topo component lazy load
This commit was SVN r4659.
2005-03-03 22:42:10 +00:00
Josh Hursey
ba4bcc48db Added two new MCA Parameters:
* mpi_show_mca_params
   If set to true, this turns on the dumping of all MCA parameters when MPI_INIT is called. 
   Only the 'rank 0' processes will print the parameters.

* mpi_show_mca_params_file
   (This value is only used if the first argument is set to true) If this value is non-NULL 
   it specifies the file to put the dump into. This file can then be used as input to mpirun 
   for debugging purposes. If this value is not set (and mpi_show_mca_params is set) then 
   the parameters are dumped to stdout.

I also changed the following parameters to internal=true:
  gpr_base_replica
  ns_base_replica
  pcmclient_env_cellid

This commit was SVN r4475.
2005-02-21 18:56:30 +00:00
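The dump behaviour described in ba4bcc48db (only rank 0 prints; a non-NULL file name redirects the dump, otherwise it goes to stdout) can be sketched roughly as follows. This is a minimal illustration, not the Open MPI implementation: `param_t`, `dump_params`, and the `name=value` output format are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical parameter record; not Open MPI's real type. */
typedef struct {
    const char *name;
    const char *value;
} param_t;

/*
 * Dump parameters if `show` is set and the caller is rank 0.
 * A non-NULL `path` selects a file; otherwise stdout is used.
 * Returns the number of parameters written, 0 if nothing was
 * dumped, or -1 if the file could not be opened.
 * The "name=value" line format is an assumption of this sketch.
 */
static int dump_params(bool show, int rank, const char *path,
                       const param_t *params, size_t count)
{
    if (!show || rank != 0) {
        return 0;
    }

    FILE *out = (path != NULL) ? fopen(path, "w") : stdout;
    if (out == NULL) {
        return -1;
    }

    for (size_t i = 0; i < count; i++) {
        fprintf(out, "%s=%s\n", params[i].name, params[i].value);
    }

    if (out != stdout) {
        fclose(out);
    } else {
        fflush(stdout);
    }
    return (int)count;
}
```

Calling `dump_params(true, 0, NULL, params, n)` writes to stdout, while any non-zero rank or `show == false` writes nothing.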
Brian Barrett
0d82642b40 * Split thread support build conditionals into MPI threads and progress
threads (defaults to use MPI threads, disable progress threads).  This
  allows us to have MPI threaded support, but without progress threads
  and all that fun stuff.

This commit was SVN r4443.
2005-02-16 17:42:07 +00:00
Prabhanjan Kambadur
2439244f0c These are some changes which will enable dynamic builds to go through on Windows. Most of the changes are in adding/deleting Windows symbol exporting things.
This commit was SVN r4377.
2005-02-10 19:08:35 +00:00
Rainer Keller
6ee5a29c2f Add a stack trace feature, which reports which signal was raised, and
where, after MPI startup.
For this, a new mpirun parameter "mpi_signal" is added. It takes a
comma-separated list of signals to catch, e.g. mpirun --mca mpi_signal 8,11
will check for SIGFPE and SIGSEGV.
It only catches the first fault (SA_ONESHOT), because after the handler
returns the same fault will occur again.

As printout, the data provided by siginfo_t is printed to STDOUT (yes,
it calls printf ,-]).
Additionally, with glibc, it uses backtrace and backtrace_symbols to
print the calling stack up to the function in which the signal was raised:

(Rank:0) Going to write to RD_ONLY mmaped shared mem
Signal:11 info.si_errno:0(Success) si_code:2(SEGV_ACCERR)
Failing at addr:0x4020c000
[0] func:/home/rusraink/ompi-gcc/lib/libmpi.so.0 [0x40121afe]
[1] func:./t0 [0x42029180]
[2] func:./t0(__libc_start_main+0x95) [0x42017589]
[3] func:./t0(__libc_start_main+0x49) [0x8048691]

This commit was SVN r4170.
2005-01-26 19:11:46 +00:00
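The mechanism described in 6ee5a29c2f (a one-shot handler that prints the siginfo_t data and a glibc backtrace for each signal in a comma-separated list) can be sketched as below. This is an illustration under assumptions, not the committed code: `trace_handler` and `install_trace_handlers` are hypothetical names, and `SA_RESETHAND` is used as the POSIX spelling of the `SA_ONESHOT` flag the message mentions. As in the commit, the handler calls `printf`, which is not async-signal-safe.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <execinfo.h>   /* backtrace, backtrace_symbols (glibc) */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_FRAMES 32

/* Print the siginfo_t data and the calling stack to stdout. */
static void trace_handler(int signo, siginfo_t *info, void *ctx)
{
    void *frames[MAX_FRAMES];
    (void)ctx;

    /* Not async-signal-safe; acceptable only for a diagnostic sketch. */
    printf("Signal:%d info.si_errno:%d si_code:%d\n",
           signo, info->si_errno, info->si_code);
    printf("Failing at addr:%p\n", info->si_addr);

    int n = backtrace(frames, MAX_FRAMES);
    char **names = backtrace_symbols(frames, n);
    for (int i = 0; names != NULL && i < n; i++) {
        printf("[%d] func:%s\n", i, names[i]);
    }
    free(names);
    fflush(stdout);
}

/*
 * Install the handler for every signal number in a comma-separated
 * list such as "8,11". Returns the number of handlers installed,
 * or -1 if the list is malformed or sigaction() rejects a signal.
 */
static int install_trace_handlers(const char *list)
{
    struct sigaction sa;
    int installed = 0;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = trace_handler;
    /* SA_RESETHAND restores the default action after the first delivery. */
    sa.sa_flags = SA_SIGINFO | SA_RESETHAND;
    sigemptyset(&sa.sa_mask);

    const char *p = list;
    while (*p != '\0') {
        char *end;
        long signo = strtol(p, &end, 10);
        if (end == p || (*end != ',' && *end != '\0')) {
            return -1;
        }
        if (sigaction((int)signo, &sa, NULL) != 0) {
            return -1;
        }
        installed++;
        p = (*end == ',') ? end + 1 : end;
    }
    return installed;
}
```

Because the default action is restored after the first delivery, a fault that recurs when the handler returns terminates the process as usual instead of looping through the handler.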
Tim Woodall
0d2784568e fixed I/O forwarding cleanup
This commit was SVN r4033.
2005-01-18 17:32:54 +00:00
Tim Woodall
4bd9538bca temporarily disable I/O forwarding
This commit was SVN r4031.
2005-01-18 16:43:47 +00:00
Tim Woodall
9648b5bc36 starting integration of i/o forwarding framework
This commit was SVN r3986.
2005-01-13 15:30:49 +00:00
Jeff Squyres
9802822b7b Make the opening of the io framework be lazy -- only occurs at the
first invocation of MPI_File_open or MPI_File_delete (whichever is
first).  The io framework is then only closed down if it was
successfully opened.  

This is the first [atomic] step to having a progress thread in the
ROMIO component; it wasn't strictly *necessary*, but it's logically
the same direction and provided a good test case.

This commit was SVN r3895.
2005-01-04 15:43:26 +00:00
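The lazy-open pattern from 9802822b7b (open the framework on the first file operation, and close it only if it was actually opened) looks roughly like this. All names here are hypothetical stand-ins rather than the real Open MPI functions, and the sketch is not thread-safe.

```c
#include <assert.h>
#include <stdbool.h>

/* Counters so the behaviour can be observed; illustration only. */
static int open_calls = 0;
static int close_calls = 0;
static bool io_opened = false;

/* Hypothetical stand-ins for the real framework open/close routines. */
static int io_framework_open(void)  { open_calls++; return 0; }
static void io_framework_close(void) { close_calls++; }

/* Open the framework on first use; later calls are no-ops. */
static int io_ensure_open(void)
{
    if (io_opened) {
        return 0;
    }
    int rc = io_framework_open();
    if (rc == 0) {
        io_opened = true;
    }
    return rc;
}

/* Both entry points trigger the lazy open, whichever runs first. */
static int file_open(const char *path)   { (void)path; return io_ensure_open(); }
static int file_delete(const char *path) { (void)path; return io_ensure_open(); }

/* Shut the framework down only if it was successfully opened. */
static void finalize(void)
{
    if (io_opened) {
        io_framework_close();
        io_opened = false;
    }
}
```

A program that never touches files pays nothing: `finalize()` sees `io_opened == false` and skips the close.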
Tim Woodall
572df54b2a enable event processing for oob w/ non-thread build
This commit was SVN r3810.
2004-12-14 16:34:10 +00:00
Ralph Castain
ed197f0186 More minor changes that continue to make progress on comm_spawn. Nothing significant - no impact on other operations.
PLEASE NOTE: there are some diagnostic messages in oob_xcast that will print out. Please don't have a cow about them - they won't hurt nor injure anyone, and it's just there for a little while to help Tim and me debug a problem. Just didn't want to create yet another MCA parameter to debug 10 lines of code. :-)

This commit was SVN r3756.
2004-12-09 04:54:37 +00:00
Ralph Castain
5e560cb148 Fix a problem that affected attributes - we were missing the vm_register function call.
This commit was SVN r3692.
2004-12-03 21:05:22 +00:00
Ralph Castain
bf4bfd7472 This will have zero impact on current operations. It adds some functionality for monitoring system state-of-health to support the Eclipse interface project. No use is made of that functionality in the system yet, but this will come along soon. I'll provide more info on exactly what information is now being stored later.
This commit was SVN r3672.
2004-11-30 16:27:32 +00:00
Jeff Squyres
616269a9be Add HLRS copyright
This commit was SVN r3665.
2004-11-28 20:09:25 +00:00
Jeff Squyres
e9ed717748 First cut at copyrights: IU, UTK, and some OSU. LANL and HLRS still
pending.

This commit was SVN r3655.
2004-11-22 01:38:40 +00:00
Ralph Castain
bf9087d9d1 The merged main trunk and gpr integration branch. Tested on Mac only so far - will check out and test on Linux. If that has a problem, will back all changes out (again), but I think we have this one correct. Will send out a more complete change notice once testing is complete.
This commit was SVN r3644.
2004-11-20 19:12:43 +00:00
Tim Woodall
c80818463f use oob barrier for mpi init
This commit was SVN r3606.
2004-11-17 23:37:49 +00:00
Tim Woodall
f78fb7a029 oops - should default to off for non-threaded build
This commit was SVN r3605.
2004-11-17 23:21:18 +00:00
Tim Woodall
70d1c1aafd rather large commit:
- change mca_ptl_base_header_t definition to decrease the
  header size for small messages. note that this requires
  all ptls to be updated. tcp/self/sm/mx have been changed,
  gm/ib/quadrics will be broken by this commit. george and
  mitch have volunteered to make the required changes to gm/ib
- revised matching logic to reduce function call overhead
- changes to tcp/self/sm/mx ptls to support the revised headers

This commit was SVN r3602.
2004-11-17 22:47:08 +00:00
Brian Barrett
23a6d5bb60 * roll back r3584 (gpr changes to reduce floods) as it appears to cause
some instability on Linux

This commit was SVN r3587.

The following SVN revision numbers were found above:
  r3584 --> open-mpi/ompi@52add381d0
2004-11-17 02:30:07 +00:00
Brian Barrett
52add381d0 * Merge over the gpr changes Ralph has made on the gpr-integration branch.
This may trigger a complete rebuild :(.  Short overview of changes:

  - reduce number of network slams at startup
  - prevent gpr from hanging when doing process death code
  - general gpr cleanups

This commit was SVN r3584.
2004-11-16 22:53:33 +00:00
Jeff Squyres
7f3790615a - Implement MPI_Is_thread_main
- Add new thread function to compare current thread to an
  ompi_thread_t*
- Save current thread ID in MPI_INIT*
- Add first cut of solaris thread functions
- Add stubs for missing Windows thread functions

This commit was SVN r3569.
2004-11-15 20:03:14 +00:00
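The approach listed in 7f3790615a (save the current thread ID during MPI_INIT, then compare the calling thread against it) can be illustrated with POSIX threads. This is a sketch assuming a pthreads platform; `record_main_thread` and `thread_is_main` are hypothetical names, not the Open MPI thread API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Thread ID captured at initialization time. */
static pthread_t main_thread;

/* Call once from the thread that performs initialization. */
static void record_main_thread(void)
{
    main_thread = pthread_self();
}

/* True when the caller is the thread that called record_main_thread(). */
static bool thread_is_main(void)
{
    /* pthread_t is opaque, so pthread_equal() is used instead of ==. */
    return pthread_equal(pthread_self(), main_thread) != 0;
}
```

Any thread created later gets a different `pthread_t`, so `thread_is_main()` returns false when called from it.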
George Bosilca
9659288e74 I hate waiting at airports, so I started doing something useful ...
I removed a lot of inter-dependence and now use the struct_t type.
BEWARE: not all the functions are ready.

This commit was SVN r3524.
2004-11-05 07:52:30 +00:00
Prabhanjan Kambadur
e2e3bead65 fixing up small mistakes
This commit was SVN r3399.
2004-10-28 19:12:45 +00:00
Tim Woodall
847c08fda5 - for non-threaded builds - set progress to be blocking for non-mpi apps
- reorg MX

This commit was SVN r3383.
2004-10-28 15:40:46 +00:00
Tim Woodall
f557b215ee reimplemented module exchange to use registry publish/subscribe
This commit was SVN r3140.
2004-10-14 20:50:06 +00:00
Jeff Squyres
722e51aa2c Oops -- assignment was the wrong way around.
This commit was SVN r3021.
2004-10-10 00:03:42 +00:00
Jeff Squyres
80b38390ab Enable proper f2c <--> c2f MPI_Request translation. Pick up f2c <-->
c2f MPI_Status translation along the way.  This should enable Fortran
MPI apps that use non-blocking communication to start working.

This commit was SVN r2996.
2004-10-08 17:12:36 +00:00
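For context on 80b38390ab: Fortran sees MPI handles as integers while C code works with handle objects, so f2c/c2f translation needs a mapping between the two. One common way to do this is an index table, sketched below. The commit message does not say how the translation is implemented, so the table approach and every name here (`request_t`, `request_register`, `request_c2f`, `request_f2c`) are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 64

/* Hypothetical request object that remembers its Fortran index. */
typedef struct request {
    int f_index;
} request_t;

/* Fortran integer handle -> C object; NULL marks a free slot. */
static request_t *f_to_c_table[TABLE_SIZE];

/* Store the request in the first free slot; returns its index or -1. */
static int request_register(request_t *req)
{
    for (int i = 0; i < TABLE_SIZE; i++) {
        if (f_to_c_table[i] == NULL) {
            f_to_c_table[i] = req;
            req->f_index = i;
            return i;
        }
    }
    return -1;
}

/* C object -> Fortran integer handle. */
static int request_c2f(const request_t *req)
{
    return req->f_index;
}

/* Fortran integer handle -> C object, or NULL if the handle is invalid. */
static request_t *request_f2c(int f_handle)
{
    if (f_handle < 0 || f_handle >= TABLE_SIZE) {
        return NULL;
    }
    return f_to_c_table[f_handle];
}
```

With a mapping like this, a Fortran wrapper for a non-blocking call can hand the integer back to the application and recover the C object later when the application waits on it.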
Jeff Squyres
03dc8e21fb Silence meaningless compiler warning
This commit was SVN r2966.
2004-10-06 23:40:30 +00:00
Ralph Castain
3c92d18fc7 Consolidate the RTE startup sequence into a single function call for simpler maintenance. We seem to have this debugged enough now to commonize the startup across the various programs. Modify mpi_init, mpirun, openmpi, ompid, and ompiconsole accordingly.
This commit was SVN r2910.
2004-10-01 22:22:21 +00:00
Edgar Gabriel
05a28efd1f first cut on the comm_spawn mechanism. It doesn't work yet
(and I don't know why), but it also doesn't seem to break anything else...

This commit was SVN r2874.
2004-09-29 12:41:55 +00:00
Ralph Castain
4c0053579d ka-ching
This commit was SVN r2828.
2004-09-23 14:35:02 +00:00
Ralph Castain
0d4e6482cd Continuing the cleanup process. Few minor fixes here and there - mostly just NULLing pointers that were free'd. Console now can connect to any universe, regardless of scope.
This commit was SVN r2734.
2004-09-17 00:59:14 +00:00
Ralph Castain
f6dc129754 Allow mpirun2 and mpi_init to cleanly detect and join an existing universe. Will continue testing to quickly move away from a non-responsive existing universe.
This commit was SVN r2729.
2004-09-16 19:45:32 +00:00
Tim Woodall
ad7db4e1cb restored call to ompi_rte_register
This commit was SVN r2699.
2004-09-16 08:38:24 +00:00
Ralph Castain
d0e308fbc4 First attempt to thread safe the registry and name server subsystems. Comment out the duplicate calls to register processes in mpi_init and mpirun2.
This commit was SVN r2697.
2004-09-16 04:14:35 +00:00
Jeff Squyres
eb4279559e Some fixes for the attribute code:
- move the attribute init section in ompi_mpi_init() down below where
  communicators are setup (we need MPI_COMM_WORLD to be setup before
  attributes and keyvals are setup)
- removed confusing extra wrapper class around ompi_hash_table_t;
  looks like it was a victim of slow eroding of members so I took
  it out back and put it out of its misery
- added preliminary definitions for all the pre-defined keyvals.
  Still need more work here to assign their initial values, but I
  think Edgar was running into an attribute problem and it may have
  been that the pre-defined attrs didn't yet exist.
- removed some LAM-specific predefined keyvals from mpi.h

This commit was SVN r2695.
2004-09-16 00:00:09 +00:00
Ralph Castain
c65619e294 Move the enviro parsing function a little bit. Doesn't seem to be causing a problem with a dynamic build under Mac, but Jeff noted problem with a static build under Linux. Weird....
This commit was SVN r2684.
2004-09-15 19:43:32 +00:00
Ralph Castain
70dae461e4 MPI_Init will now detect and join a persistent universe - hooray! Fixed the session_dir cleanup process so it is kinder to the universe-setup file (i.e., leaves it alone), thus allowing persistent universes to retain their contact info on the session_dir tree. Adjusted mpirun2, ompid, and ompiconsole accordingly.
Put some error protection in ompi_rte_monitor.

This commit was SVN r2678.
2004-09-15 16:33:36 +00:00
Ralph Castain
f7fac7f214 Oops....forgot to take out the diagnostic printouts. Sorry.
This commit was SVN r2662.
2004-09-14 14:28:20 +00:00
Ralph Castain
069682e046 A bunch of minor changes, mostly adding diagnostics. Just wanted to checkpoint so I can start fresh since there now seem to be problems in the tree with mpirun2.
Fixed ompid so it reissues the non-blocking receive - should now be close to ready for primetime. Fixed some logic in the svc framework that wasn't checking properly for action flags. 

This commit was SVN r2660.
2004-09-14 14:21:04 +00:00
Ralph Castain
c6cbe33d50 Some of these didn't really change - I was just in/out of them for diagnostics while chasing a bug. Got caught by my good buddy Tim again :) on his parse_contact_info function, which requires that the space for the answer be allocated in advance. Sigh. Anyway, mpirun2 now works again. My apologies if you tried it in the last few hours and found it didn't.
Also removed the mpirun3 directory since we are basically dragging mpirun2 along with us - no need to create a new version after all.

Made a few changes to the universe info structure, eliminating the "webserver" and "socket" fields since we will do those contacts through the oob channel. Also changed the "silent_mode" field to "console" since silent mode is the default - the flag needs to tell you to turn the console on, not off.

Parse environ function now gets the ns and gpr replica contact info and loads it in the proper places to hand it off to the respective components, thus allowing me to check connection to them as part of determining if the named universe already exists. Changed the local_universe_exists function accordingly and gave it a new name (since the replicas may not be local). This name will shortly be changed to "ompi_rte_join_universe" as I complete the logic for doing that function.

Please let me know if you see any problems. I successfully ran some trivial multi-process functions in both mpirun2 and singleton modes, and ran the seed daemon as well, so I think it should all be okay.

This commit was SVN r2611.
2004-09-11 12:56:52 +00:00
Ralph Castain
106e07f759 Some reorganization of the startup process functions that is transparent to anyone using mpirun2 and/or running as a singleton. Please note that the old mpirun script may well not work any more - I have not been trying to keep that one running.
For those of you looking into the guts of these functions, the most visible changes are:

- raising the assignment of the process name to a higher level, taking it out of the "hole" it had fallen into. We've been having problems with multiple functions assigning the process name. This is understandable - lots of workarounds were implemented in the early development stages. However, it was becoming hard to determine WHEN the name was being defined - it was being hidden under too many layers of function calls. Hence, it is now assigned in the three primary programs in a very visible fashion. Hopefully, we can now chase down all the other places and get rid of them.

- similarly, I raised the visibility of when the session directory gets constructed to ensure it doesn't get done at the wrong time and/or multiple times.

- created a new function that parses all the non-mca level environmental variables and assigns the info into the corresponding structures. I have also included notes in this function and in the various ompi_rte_init_stage functions about proper ordering.

- modified the rte cmd line parsers to store the options they find into the environment so they can be passed along later

That about does it.

This commit was SVN r2589.
2004-09-10 03:21:03 +00:00
Ralph Castain
c1ba40c631 Fix mpirun2 and ompi_mpi_init to be fully backward compatible. All required values are now passed via environmental parameters, and the receiving parties know what to do with them.
Added a field to the ompi_rte_node_schedule_t structure to keep track of the number of items on the environ list, thus making it easier to append more things to it. Adjusted the mca_pcm_base_build_base_env function correspondingly to take that field as an additional argument.

Changed mpirun2 to a .c program for convenience since it wasn't using any c++ features anyway.

This commit was SVN r2561.
2004-09-09 15:23:41 +00:00
Jeff Squyres
86e3cfa85d Remove all remaining traces of the common command line utility. It
doesn't do what it was designed for, and therefore wasn't useful, so
per discussion with Ralph last night, we decided to scrap it.

This commit was SVN r2539.
2004-09-08 15:02:35 +00:00
Ralph Castain
e8c36d02c9 Not as bad as this all may look. Tim and I made a significant change to the way we handle the startup of the oob, the seed, etc. We have made it backwards-compatible so that mpirun2 and singleton operations remain working. We had to adjust the name server and gpr as well, plus the process_info structure.
This also includes a checkpoint update to openmpi.c and ompid.c. I have re-enabled the ompid compile.

This latter raises an important point. The trunk compiles the programs like ompid just fine under Linux. It also does just fine for OSX under the dynamic libraries. However, we are seeing errors when compiling under OSX for the static case - the linker seems to have trouble resolving some variable names, even though linker diagnostics show the variables as being defined. Thus, a warning to Mac users that you may have to locally turn things off if you are trying to do static compiles. We ask, however, that you don't commit those changes that turn things off for everyone else - instead, let's try to figure out why the static compile is having a problem, and let everyone else continue to work.

Thanks
Ralph

This commit was SVN r2534.
2004-09-08 03:59:06 +00:00
Jeff Squyres
3498f7a283 First cut of the Show Help Subsystem (SHS)
- see src/util/show_help.h for details (doxygen); main function call
  is ompi_show_help()

- text message files are expected to be located in $pkgdatadir
  (usually $prefix/share/openmpi).  Anyone can install a text file in
  $pkgdatadir with their message(s) in it and then have them displayed
  via ompi_show_help().  "pkgdata_DATA" is the keyword to use in
  Makefile.am's, for example (from src/mca/base/Makefile.am):

    pkgdata_DATA = help-mca-base.txt

- added a few examples in the code base using ompi_show_help(), but
  not too many -- can convert more "show_help" comments in the code
  over time; no huge rush.  :-)
- no i18n-like support yet; waiting for advice and consensus from
  other developers

This commit was SVN r2519.
2004-09-05 16:05:37 +00:00