This may trigger a complete rebuild :(. Short overview of changes:
- reduce number of network slams at startup
- prevent gpr from hanging while running the process-death code
- general gpr cleanups
This commit was SVN r3584.
- register non-blocking recv for process starter whenever a new spawn
occurs.
- send kill message when rte_kill_job or kill_proc is called
- pcm does its mojo to result in the death of the processes
This commit was SVN r3458.
1. header file and source file protections using #ifdef WIN32
2. new files and directories to support windows functionality
3. appropriate linkage symbols added (OMPI_DECLSPEC) for windows
4. some functions are unimplemented on the windows side, mostly
because there may be no need to implement them in windows land, e.g., forking
a daemon off
5. Introduced locking mechanisms for windows
This commit was SVN r3286.
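A minimal sketch of the kind of linkage macro and platform guard described in r3286 above. The macro name EXAMPLE_DECLSPEC and the function example_can_fork_daemon are hypothetical stand-ins; the actual definition of OMPI_DECLSPEC in the tree may well differ.

```c
/* Assumption: one plausible shape for an export macro. This is NOT
 * the real OMPI_DECLSPEC definition, only an illustration. */
#if defined(WIN32) || defined(_WIN32)
#  define EXAMPLE_DECLSPEC __declspec(dllexport)
#else
#  define EXAMPLE_DECLSPEC /* expands to nothing on non-windows builds */
#endif

/* Hypothetical query: can this platform fork a daemon off? */
EXAMPLE_DECLSPEC int example_can_fork_daemon(void);

int example_can_fork_daemon(void)
{
#if defined(WIN32) || defined(_WIN32)
    /* left unimplemented on the windows side */
    return 0;
#else
    return 1;
#endif
}
```

Callers can check the return value once and skip the daemon path where it is unimplemented, instead of scattering #ifdef WIN32 through every call site.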
* fix typo in error message for spawning processes
* Remove the name field from the global ompi_process_info struct, replacing
usage with calls to ompi_rte_get_self(). Cleaned up the resulting logic
in ompi_rte_init() to make it slightly simpler when dealing with the
singleton case. Reduces data duplication and I believe fixes bug
#1009 as a nice side effect.
This commit was SVN r3230.
- Add #include "ompi_config.h" to all .c files, and ensure that it's
the first #included file
- remove a few useless #if HAVE_CONFIG_H checks
This commit was SVN r3229.
* memory leak cleanups
* implement rsh's kill_proc and kill_job for the case where we
keep the ssh connections alive. At least, I think this will work.
Need to test some more.
This commit was SVN r2884.
* make hostfile llm properly deal with the oversubscribe situation. Rather
than returning fewer nodes than requested (which is no longer possible, as
it made for a bookkeeping nightmare and no one was paying attention
to it anyway), we just oversubscribe the nodes. In the future, we
need to add a flag to allocate resources saying whether to allow
oversubscription (if the resource allocator permits - clearly rsh
does, rms not so much).
This commit was SVN r2808.
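As a rough illustration of the oversubscription behavior described in r2808 above: a sketch that fills the advertised slots first and then keeps placing processes round-robin. The struct, the function name, and the round-robin policy are all assumptions for illustration; the commit does not say how the hostfile llm distributes the extra processes.

```c
#include <stddef.h>

/* Hypothetical node record; not the real llm data structure. */
typedef struct {
    int slots;    /* cpus advertised for this node in the hostfile */
    int assigned; /* processes placed on this node so far */
} example_node_t;

/* Place nprocs processes on nnodes nodes. Never returns fewer than
 * requested: once all slots are used, nodes are oversubscribed.
 * Returns 0 on success, -1 on bad arguments. */
int example_allocate(example_node_t *nodes, size_t nnodes, int nprocs)
{
    size_t i;
    int remaining = nprocs;

    if (NULL == nodes || 0 == nnodes || nprocs < 0) {
        return -1;
    }

    /* first pass: respect the advertised slot counts */
    for (i = 0; i < nnodes; ++i) {
        int take = nodes[i].slots < remaining ? nodes[i].slots : remaining;
        if (take < 0) {
            take = 0;
        }
        nodes[i].assigned = take;
        remaining -= take;
    }

    /* second pass: oversubscribe round-robin (assumed policy) */
    for (i = 0; remaining > 0; i = (i + 1) % nnodes) {
        nodes[i].assigned++;
        remaining--;
    }
    return 0;
}
```

A future "allow oversubscription" flag, as the message suggests, would simply skip the second pass and report an error when remaining is still positive.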
sets running at once - requires an additional step in spawning to get a
handle (that will contain multiple pcms when we support multi-cell)
* change the selection logic of the pcms to not care about setting threads,
but instead to select based on the selected thread level, since it
would be a little late by the time we did the selection for pcms.
* started the long process of cleaning up the rsh pcm so that it
actually kills processes and things. Still doesn't do anything useful,
but getting to the point where that might be possible
This commit was SVN r2794.
* remove the ns param switch - always use the ns at this point
* clean up some of the evil rms code that wasn't multi-pcm safe. still
have some work to do on this front
This commit was SVN r2779.
Fixed ompid so it reissues the non-blocking receive - should now be close to ready for primetime. Fixed some logic in the svc framework that wasn't checking properly for action flags.
This commit was SVN r2660.
Also removed the mpirun3 directory since we are basically dragging mpirun2 along with us - no need to create a new version after all.
Made a few changes to the universe info structure, eliminating the "webserver" and "socket" fields since we will do those contacts through the oob channel. Also changed the "silent_mode" field to "console" since silent mode is the default - the flag needs to tell you to turn the console on, not off.
Parse environ function now gets the ns and gpr replica contact info and loads it in the proper places to hand it off to the respective components, thus allowing me to check connection to them as part of determining if the named universe already exists. Changed the local_universe_exists function accordingly and gave it a new name (since the replicas may not be local). This name will shortly be changed to "ompi_rte_join_universe" as I complete the logic for doing that function.
Please let me know if you see any problems. I successfully ran some trivial multi-process functions in both mpirun2 and singleton modes, and ran the seed daemon as well, so I think it should all be okay.
This commit was SVN r2611.
at the same time and multiple modules of the same component to be loaded
at the same time (but not launching procs in the same job).
- add a "this" pointer to all the PCM functions
- make base select() function return a list of selected pcms, based on
given criteria bitmask
- update all the pcms to match
* Add an insert-before-position function to the ompi_list code
This commit was SVN r2590.
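For readers unfamiliar with the list change in r2590 above, here is a sketch of an insert-before-position operation on a doubly linked list with a sentinel node. All names here (example_list_t, example_list_insert_before, and so on) are hypothetical; this is not the ompi_list API, just one common way such a function is written.

```c
#include <stddef.h>

/* Hypothetical list item and list types, for illustration only. */
typedef struct example_item {
    struct example_item *prev;
    struct example_item *next;
    int value;
} example_item_t;

typedef struct {
    example_item_t head; /* sentinel: head.next is first, head.prev is last */
    size_t length;
} example_list_t;

void example_list_init(example_list_t *list)
{
    list->head.next = &list->head;
    list->head.prev = &list->head;
    list->length = 0;
}

/* Link item into the list immediately before pos. Passing the
 * sentinel (&list->head) as pos appends to the end of the list. */
void example_list_insert_before(example_list_t *list,
                                example_item_t *pos,
                                example_item_t *item)
{
    item->next = pos;
    item->prev = pos->prev;
    pos->prev->next = item;
    pos->prev = item;
    list->length++;
}
```

With a sentinel, inserting at the front, middle, or end all take the same four pointer updates, so no special cases are needed.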
For those of you looking into the guts of these functions, the most visible changes are:
- raising the assignment of the process name to a higher level, taking it out of the "hole" it had fallen into. We've been having problems with multiple functions assigning the process name. This is understandable - lots of workarounds were implemented in the early development stages. However, it was becoming hard to determine WHEN the name was being defined - it was being hidden under too many layers of function calls. Hence, it is now assigned in the three primary programs in a very visible fashion. Hopefully, we can now chase down all the other places and get rid of them.
- similarly, I raised the visibility of when the session directory gets constructed to ensure it doesn't get done at the wrong time and/or multiple times.
- created a new function that parses all the non-mca level environmental variables and assigns the info into the corresponding structures. I have also included notes in this function and in the various ompi_rte_init_stage functions about proper ordering.
- modified the rte cmd line parsers to store the options they find into the environment so they can be passed along later
That about does it.
This commit was SVN r2589.
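The last item in r2589 above (command line parsers storing the options they find into the environment) can be sketched roughly as follows. The EXAMPLE_RTE_ prefix and both function names are made up for illustration; the variable names actually used by the rte parsers are not given in the message. setenv() is POSIX, so this sketch does not apply to the windows side as written.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* for setenv() */
#endif
#include <stdio.h>
#include <stdlib.h>

/* Build "EXAMPLE_RTE_<name>" in key. Returns 0 on success, -1 if the
 * name is missing or the result would not fit. */
static int example_make_key(char *key, size_t len, const char *name)
{
    int n;
    if (NULL == name) {
        return -1;
    }
    n = snprintf(key, len, "EXAMPLE_RTE_%s", name);
    return (n < 0 || (size_t) n >= len) ? -1 : 0;
}

/* Store a parsed option in the environment so that it can be passed
 * along later, e.g. to a child process. Returns 0 on success. */
int example_store_option(const char *name, const char *value)
{
    char key[128];
    if (NULL == value || 0 != example_make_key(key, sizeof(key), name)) {
        return -1;
    }
    return setenv(key, value, 1); /* 1 = overwrite existing value */
}

/* Look a stored option up again; NULL if it was never set. */
const char *example_lookup_option(const char *name)
{
    char key[128];
    if (0 != example_make_key(key, sizeof(key), name)) {
        return NULL;
    }
    return getenv(key);
}
```

Because the values live in the environment, anything started later from the same process inherits them without extra plumbing.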
host / cpu information down into a handle that need not exist when
the llm isn't being used. Fix all the test cases and whatnot to match
This commit was SVN r2490.
structures to make it easier to swap around lists when doing process ->
resource mapping
* Fix spawn interface to take an ompi_list_t* instead of an ompi_list_t
since you can't pass an ompi_list_t by value
* Change allocate_resource to return an ompi_list_t* instead of having
an ompi_list_t** as an argument, since it's a bit cleaner and makes it
much clearer who should call OBJ_NEW
* Clean up deallocation in error cases for the llm_base_allocate function
* Update test case for llm to not depend on current environment for
correctness
This commit was SVN r2126.
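The two interface changes in r2126 above (spawn taking a list pointer, allocate returning a list pointer) follow a common C pattern, sketched here with a hypothetical stand-in type. example_list_t, example_allocate_resources, and example_spawn are illustrative names only; malloc/free stand in for OBJ_NEW and its release counterpart, and the real signatures are not shown in the message.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a list object; not ompi_list_t. */
typedef struct {
    int count; /* number of resources held by the list */
} example_list_t;

/* The callee creates the list and returns it, so ownership is clear:
 * the allocator constructs, the caller releases. NULL means failure.
 * (An out-parameter of type example_list_t** would leave it unclear
 * which side is supposed to construct the object.) */
example_list_t *example_allocate_resources(int nodes)
{
    example_list_t *list;

    if (nodes <= 0) {
        return NULL;
    }
    list = malloc(sizeof(*list));
    if (NULL == list) {
        return NULL;
    }
    list->count = nodes;
    return list;
}

/* Spawn takes a pointer, since the list object cannot be passed by
 * value. Returns the number of resources it would use, -1 on error. */
int example_spawn(const example_list_t *resources)
{
    if (NULL == resources) {
        return -1;
    }
    return resources->count;
}
```

Returning the pointer also frees up the return value of the old style for nothing else: failure is simply NULL, which keeps error-path cleanup in the caller short.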
* Add more of the mpirun shell - still far from functional
* Expand the src/runtime interface to include the parts of the pcm needed
for mpirun
This commit was SVN r1998.
abstraction a little clearer.
* Include mpiruntime.h instead of runtime.h in errhandler.h since only the
MPI stuff was needed - speeds compile times greatly when working on the
RTE...
This commit was SVN r1948.