
22 commits

Author SHA1 Message Date
Brian Barrett
77c65d69cc * Merge changes from tim branch from r 4821 to 4892. Tree can now run
MPI and non-ORTE applications for RSH on one node with or without
  threads.  I think we're approaching convergence with the tim branch

This commit was SVN r4895.
2005-03-18 03:43:59 +00:00
Brian Barrett
6822a519bb * results from initial merge of the tim branch into the trunk. Compiles and
ompi_info works, but that's all that has been tested.

This commit was SVN r4827.
2005-03-14 20:57:21 +00:00
Prabhanjan Kambadur
46c2c11680 Function prototypes and global symbols need to be protected from C++ name mangling
This commit was SVN r4165.
2005-01-26 18:23:26 +00:00
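
The name-mangling protection mentioned in the commit above is usually done with an `extern "C"` guard around the declarations in a header. The following is only a minimal sketch of that general technique: the file names, `example_add`, and `example_call_count` are hypothetical and are not taken from the Open MPI tree.

```c
/* example_proto.h: hypothetical header, not taken from the Open MPI tree. */
#ifndef EXAMPLE_PROTO_H
#define EXAMPLE_PROTO_H

#if defined(__cplusplus)
extern "C" {            /* seen only by C++ compilers: disables name mangling */
#endif

/* A global symbol and a prototype; both keep C linkage in C and C++ builds. */
extern int example_call_count;
int example_add(int a, int b);

#if defined(__cplusplus)
}
#endif

#endif /* EXAMPLE_PROTO_H */

/* example_proto.c: the matching definitions, compiled as C. */
int example_call_count = 0;

int example_add(int a, int b)
{
    example_call_count++;   /* count calls so the global is actually used */
    return a + b;
}
```

A C++ translation unit that includes such a header links against the plain C symbol names, while a C compiler never sees the `extern "C"` lines at all.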
Prabhanjan Kambadur
9ac9f15537 These are the changes after the review with Jeff. Mostly are fixes for OOB and TCP
This commit was SVN r4070.
2005-01-20 00:03:23 +00:00
Ralph Castain
72e1161605 Clean up the notify message class - missing the ompi_object_t element in structure.
This commit was SVN r3774.
2004-12-10 17:53:41 +00:00
Ralph Castain
2ee47e0708 Some cleanup at the end of the comm_spawn work.
Comm_spawn is now fully functional. I'll send out a separate message about some of the problems encountered, and resulting action items.

This commit was SVN r3770.
2004-12-10 02:34:19 +00:00
Ralph Castain
ed197f0186 More minor changes that continue to make progress on comm_spawn. Nothing significant - no impact on other operations.
PLEASE NOTE: there are some diagnostic messages in oob_xcast that will print out. Please don't have a cow about them - they won't hurt nor injure anyone, and it's just there for a little while to help Tim and I debug a problem. Just didn't want to create yet another MCA parameter to debug 10 lines of code. :-) 

This commit was SVN r3756.
2004-12-09 04:54:37 +00:00
Ralph Castain
d21c0027df Well, we are getting closer to resolving the comm_spawn problem. For the benefit of those that haven't been in the midst of this discussion, the problem is that this is the first case where the process starting a set of processes has not been mpirun and is not guaranteed to be alive throughout the lifetime of the spawned processes. This sounds simple, but actually has some profound impacts.
Most of this checkin consists of more debugging stuff. Hopefully, you won't see any printf's that aren't protected by debug flags - if you do, let me know and I'll take them out with my apologies.

Outside of debugging, the biggest change was a revamp of the shutdown process. For several reasons, we had chosen to have all processes "wait" for a shutdown message before exiting. This message is typically generated by mpirun, but in the case of comm_spawn we needed to do something else. We have decided that the best way to solve this problem is to:

(a) replace the shutdown message (which needed to be generated by somebody - usually mpirun) with an oob_barrier call. This still requires that the rank 0 process be alive. However, we terminate all processes if one abnormally terminates anyway, so this isn't a problem (with the standard or our implementation); and

(b) have the state-of-health monitoring subsystem issue the call to cleanup the job from the registry. Since the state-of-health subsystem isn't available yet, we have temporarily assigned that responsibility to the rank 0 process.  Once the state-of-health subsystem is available, we will have it monitor the job for all-processes-complete and then it can tell the registry to cleanup the job (i.e., remove all data relating to this job).

Hope that helps a little. I'll put all this into the design docs soon.

This commit was SVN r3754.
2004-12-08 21:44:41 +00:00
Jeff Squyres
616269a9be Add HLRS copyright
This commit was SVN r3665.
2004-11-28 20:09:25 +00:00
Jeff Squyres
e9ed717748 First cut at copyrights: IU, UTK, and some OSU. LANL and HLRS still
pending.

This commit was SVN r3655.
2004-11-22 01:38:40 +00:00
Ralph Castain
bf9087d9d1 The merged main trunk and gpr integration branch. Tested on Mac only so far - will check out and test on Linux. If that has a problem, will back all changes out (again), but I think we have this one correct. Will send out a more complete change notice once testing is complete.
This commit was SVN r3644.
2004-11-20 19:12:43 +00:00
Brian Barrett
23a6d5bb60 * roll back r3584 (gpr changes to reduce floods) as it appears to cause
some instability on Linux

This commit was SVN r3587.

The following SVN revision numbers were found above:
  r3584 --> open-mpi/ompi@52add381d0
2004-11-17 02:30:07 +00:00
Brian Barrett
52add381d0 * Merge over the gpr changes Ralph has made on the gpr-integration branch.
This may trigger a complete rebuild :(.  Short overview of changes:

  - reduce number of network slams at startup
  - prevent gpr from hanging when doing process death code
  - general gpr cleanups

This commit was SVN r3584.
2004-11-16 22:53:33 +00:00
Prabhanjan Kambadur
650b04c4b4 changes:
--------
1. malloc casts to the right pointers
2. function parameter casts in the components (eg., recv requires a (char *) typecast
   else CL compiler barfs)
3. added my own errno indirection. this is only in oob/tcp module. ompi_errno is #defined
   to errno in unix land and to a function ompi_get_error which returns the equivalent
   error code.
4. implemented our own fcntl to prevent spaghetti coding. this currently only takes
   F_GETFL and F_SETFL arguments, does nothing on F_GETFL and sets the nonblocking 
   option on F_SETFL
5. Moved some extern declarations to global scope since the CL compiler does not do 
   the right things if they are declared and used in static inline functions.
6. Protection around some header files. changed sys/errno to errno.
7. defined in_proto_t (unsigned uint16_t) to DWORD ... comments are welcome

This commit was SVN r3394.
2004-10-28 18:13:43 +00:00
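
Items 3 and 4 of the commit above describe an errno indirection and a reduced `fcntl`. The sketch below shows one way such a shim could look; it is an illustration, not the code from that commit. Every `example_`/`EXAMPLE_` name and the numeric command values are hypothetical, the choice of mapped Winsock codes is an assumption, and the Windows branch assumes a toolchain whose `errno.h` defines `EWOULDBLOCK` and `EINPROGRESS`.

```c
#include <errno.h>

#ifdef _WIN32
#include <winsock2.h>

typedef SOCKET example_socket_t;
#define EXAMPLE_F_GETFL    3   /* arbitrary values chosen for this sketch */
#define EXAMPLE_F_SETFL    4
#define EXAMPLE_O_NONBLOCK 1

/* Translate a few Winsock error codes into errno-style values. */
static int example_get_error(void)
{
    switch (WSAGetLastError()) {
    case WSAEWOULDBLOCK: return EWOULDBLOCK;
    case WSAEINTR:       return EINTR;
    case WSAEINPROGRESS: return EINPROGRESS;
    default:             return EIO;
    }
}
#define example_errno example_get_error()

/* Reduced fcntl: F_GETFL does nothing, F_SETFL makes the socket nonblocking. */
static int example_fcntl(example_socket_t sd, int cmd, int arg)
{
    u_long nonblocking = 1;
    (void)arg;
    if (cmd == EXAMPLE_F_GETFL) {
        return 0;
    }
    if (cmd == EXAMPLE_F_SETFL) {
        return ioctlsocket(sd, FIONBIO, &nonblocking) == 0 ? 0 : -1;
    }
    return -1;
}

#else  /* POSIX: the indirection collapses onto the native facilities */
#include <fcntl.h>

typedef int example_socket_t;
#define EXAMPLE_F_GETFL    F_GETFL
#define EXAMPLE_F_SETFL    F_SETFL
#define EXAMPLE_O_NONBLOCK O_NONBLOCK
#define example_errno      errno
#define example_fcntl      fcntl
#endif

/* Caller code stays identical on both platforms.
 * Returns 0 on success or an errno-style code on failure. */
static int example_set_nonblocking(example_socket_t sd)
{
    int flags = example_fcntl(sd, EXAMPLE_F_GETFL, 0);
    if (flags < 0) {
        return example_errno;
    }
    if (example_fcntl(sd, EXAMPLE_F_SETFL, flags | EXAMPLE_O_NONBLOCK) < 0) {
        return example_errno;
    }
    return 0;
}
```

Keeping the platform split inside two macros and one wrapper means the socket code itself needs no `#ifdef` blocks, which appears to be the "spaghetti coding" the commit message wanted to avoid.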
David Daniel
5c4c277266 Numerous niggles related to building on Solaris
This commit was SVN r2234.
2004-08-19 19:30:53 +00:00
Ralph Castain
49e8e16148 Fix the registry software to compile - get all the naming errors finally out, etc. The functionality is not present, so don't use it yet - nothing will happen. I'll be restoring the functionality over the next week or two.
This commit was SVN r2146.
2004-08-15 05:49:55 +00:00
Ralph Castain
c0121cb927 Major update to the general purpose registry. Cleaned up the mess from name changes, begin building the functionality. Long way to go....
I have to commit this to cleanup a break in my tree. I'm hoping it won't break the compile of the tree, but will fix it as quickly as possible.

Jeff - you are welcome to set an "ignore" on the gpr if you like - I'll let you know when I've got the "kinks" out.

This commit was SVN r2145.
2004-08-15 03:33:13 +00:00
Jeff Squyres
eb8cba98af - massive change for module<-->component name fixes throughout the
code base.
  - many (most) mca type names have "component" or "module" in them,
    as relevant, just to further distinguish the difference between
    component data/actions and module data/actions.  All developers
    are encouraged to perpetuate this convention when you create
    types that are specific to a framework, component, or module
  - did very little to entire framework (just the basics to make it
    compile) because it's just about to be almost entirely replaced
  - ditto for io / romio
  - did not work on elan or ib components; have to commit and then
    convert those on a different machine with the right libraries and
    headers
- renamed a bunch of *_module.c files to *_component.c and *module*c
  to *component*c (a few still remain, e.g., ptl/ib, ptl/elan, etc.)
- modified autogen/configure/build process to match new filenames
  (e.g., output static-components.h instead of static-modules.h)
- removed DOS-style cr/lf stuff in ns/ns.h
- added newline to end of file src/util/numtostr.h
- removed some redundant error checking in the top-level topo
  functions
- added a few {} here and there where people "forgot" to put them in
  for 1 line blocks ;-)
- removed a bunch of MPI_* types from mca header files (replaced with
  corresponding ompi_* types)
- all the ptl components had version numbers in their structs; removed
- converted a few more elements in the MCA base to use the OBJ
  interface -- removed some old manual reference counting cruft

This commit was SVN r1830.
2004-08-02 00:24:22 +00:00
Ralph Castain
066063fcef Bring the name server files into the repository so proc_info can compile.
This commit was SVN r1506.
2004-06-29 21:17:10 +00:00
Ralph Castain
5f7c14cb36 Updates the documentation and include file for the GPR. The documentation in base.h describes the user interfaces for the registry. Unfortunately, we haven't gotten doxygen to pick it up correctly, so for now you'll have to read it in the base.h file directly.
Sorry about that - we'll keep trying to make it work.

This commit was SVN r1319.
2004-06-16 17:01:24 +00:00
Ralph Castain
0f11e4fc33 Checkpoint the gpr stuff for the night...
This commit was SVN r1304.
2004-06-16 05:41:13 +00:00
Ralph Castain
bf86e88d5f Fix error in directory name - sorry
This commit was SVN r1238.
2004-06-12 10:47:05 +00:00