openmpi

Author	SHA1	Message	Date
Josh Hursey	1e678c3f55	per conversation with Ralph and Jeff take out the opal_init_only logic. This commit moves the initialization/finalization of opal_event and opal_progress to opal_init/finalize. These were previously init/final in ORTE which is an abstraction violation. After talking about it we concluded that there are no ordering issues that require these to be init/final in ORTE instead of OPAL. I ran the IBM test suite against this commit and it didn't turn up any new failures so I think it is good to go. Let us know if this causes problems. This commit was SVN r14773.	2007-05-24 21:54:58 +00:00
Ralph Castain	4fff584a68	Commit the orted-failed-to-start code. This correctly causes the system to detect the failure of an orted to start and allows the system to terminate all procs/orteds that did start. The primary change that underlies all this is in the OOB. Specifically, the problem in the code until now has been that the OOB attempts to resolve an address when we call the "send" to an unknown recipient. The OOB would then wait forever if that recipient never actually started (and hence, never reported back its OOB contact info). In the case of an orted that failed to start, we would correctly detect that the orted hadn't started, but then we would attempt to order all orteds (including the one that failed to start) to die. This would cause the OOB to "hang" the system. Unfortunately, revising how the OOB resolves addresses introduced a number of additional problems. Specifically, and most troublesome, was the fact that comm_spawn involved the immediate transmission of the rendezvous point from parent-to-child after the child was spawned. The current code used the OOB address resolution as a "barrier" - basically, the parent would attempt to send the info to the child, and then "hold" there until the child's contact info had arrived (meaning the child had started) and the send could be completed. Note that this also caused comm_spawn to "hang" the entire system if the child never started... The app-failed-to-start helped improve that behavior - this code provides additional relief. With this change, the OOB will return an ADDRESSEE_UNKNOWN error if you attempt to send to a recipient whose contact info isn't already in the OOB's hash tables. To resolve comm_spawn issues, we also now force the cross-sharing of connection info between parent and child jobs during spawn. Finally, to aid in setting triggers to the right values, we introduce the "arith" API for the GPR. 
This function allows you to atomically change the value in a registry location (either divide, multiply, add, or subtract) by the provided operand. It is equivalent to first fetching the value using a "get", then modifying it, and then putting the result back into the registry via a "put". This commit was SVN r14711.	2007-05-21 18:31:28 +00:00
Brian Barrett	a25ce44dc1	Clean up the preconnect code: * Don't need the 2 process case -- we'll send an extra message, but at very little cost and less code is better. * Use COMPLETE sends instead of STANDARD sends so that the connection is fully established before we move on to the next connection. The previous code was still causing minor connection flooding for huge numbers of processes. * mpi_preconnect_all now connects both OOB and MPI layers. There's also mpi_preconnect_mpi and mpi_preconnect_oob should you want to be more specific. * Since we're only using the MCA parameters once at the beginning of time, no need for global constants. Just do the quick param lookup right before the parameter is needed. Save some of that global variable space for the next guy. Fixes trac:963 This commit was SVN r14553. The following Trac tickets were found above: Ticket 963 --> https://svn.open-mpi.org/trac/ompi/ticket/963	2007-05-01 04:49:36 +00:00
Ralph Castain	18b2dca51c	Bring in the code for routing xcast stage gate messages via the local orteds. This code is inactive unless you specifically request it via an mca param oob_xcast_mode (can be set to "linear" or "direct"). Direct mode is the old standard method where we send messages directly to each MPI process. Linear mode sends the xcast message via the orteds, with the HNP sending the message to each orted directly. There is a binomial algorithm in the code (i.e., the HNP would send to a subset of the orteds, which then relay it on according to the typical log-2 algo), but that has a bug in it so the code won't let you select it even if you tried (and the mca param doesn't show, so you'd really have to try). This also involved a slight change to the oob.xcast API, so propagated that as required. Note: this has only been tested on rsh, SLURM, and Bproc environments (now that it has been transferred to the OMPI trunk, I'll need to re-test it [only done rsh so far]). It should work fine on any environment that uses the ORTE daemons - anywhere else, you are on your own... :-) Also, correct a mistake where the orte_debug_flag was declared an int, but the mca param was set as a bool. Move the storage for that flag to the orte/runtime/params.c and orte/runtime/params.h files appropriately. This commit was SVN r14475.	2007-04-23 18:41:04 +00:00
Rich Graham	f481722bdf	move the code that sets the thread level information before the btl are initialized, so that the btl's have this information for correct setup. This commit was SVN r14258.	2007-04-07 05:06:47 +00:00
Josh Hursey	dadca7da88	Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD). This merge adds Checkpoint/Restart support to Open MPI. The initial frameworks and components support a LAM/MPI-like implementation. This commit follows the risk assessment presented to the Open MPI core development group on Feb. 22, 2007. This commit closes trac:158 More details to follow. This commit was SVN r14051. The following SVN revisions from the original message are invalid or inconsistent and therefore were not cross-referenced: r13912 The following Trac tickets were found above: Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158	2007-03-16 23:11:45 +00:00
Tim Prins	dbe82c70d6	Get rid of stale file, and remove the (unused) references to it. This commit was SVN r13771.	2007-02-23 13:50:39 +00:00
Brian Barrett	8b28e5b33d	Allow the OOB to connect between all MPI applications during MPI_INIT without also establishing MPI connectivity. This commit was SVN r13595.	2007-02-09 20:17:37 +00:00
Brian Barrett	262cbbc5c9	Back out r13593, which contained a change that shouldn't be committed. This commit was SVN r13594. The following SVN revision numbers were found above: r13593 --> open-mpi/ompi@81472363ea	2007-02-09 20:13:02 +00:00
Brian Barrett	81472363ea	Allow the OOB to connect between all MPI applications during MPI_INIT without also establishing MPI connectivity. This commit was SVN r13593.	2007-02-09 20:11:40 +00:00
Ralph Castain	c7be9a7121	Complete backout of prior sched_yield and paffinity changes This commit was SVN r13530.	2007-02-07 14:22:37 +00:00
Jeff Squyres	0732c555de	Refs trac:853 Add new function opal_get_num_processors() that will return the number of processors on the local host. Does the Right Thing in POSIX environments (to include a special case for OS X), and will shortly do the Right Thing for Windows (this commit includes a change to configure, so I wanted to get that in before the US workday -- the Windows code can come shortly because it won't involve configury changes). This commit was SVN r13506. The following Trac tickets were found above: Ticket 853 --> https://svn.open-mpi.org/trac/ompi/ticket/853	2007-02-06 12:03:56 +00:00
Ralph Castain	3daf8b341b	Fix the sched_yield problem for generic environments. We now determine and set sched_yield during mpi_init based on the following logical sequence: 1. if the user has specified sched_yield, we simply do what we are told 2. if they didn't specify anything, try to get the number of processors on this node. Note that we already now get the number of local procs in our job that are sharing this node - that now comes in through the proc callback and is stored in the ompi_proc_t structures. 3. if we can get the number of processors, compare that to the number of local procs from my job that are sharing my node. If the number of local procs exceeds the number of processors, then set sched_yield to true. If not, then be a hog and set sched_yield to false 4. if we can't get the number of processors, default to conservative behavior and set sched_yield to true. Note that I have not yet dealt with the need to dynamically adjust this setting as more processes are added via comm_spawn. So far, we are only looking within our own job. Given that we have now moved this logic to mpi_init (and away from the orteds), it isn't yet clear to me how a process will be informed about the number of procs in other jobs that are also sharing this node. Something to continue to ponder. This commit was SVN r13430.	2007-02-01 19:31:44 +00:00
Brian Barrett	49bcec64e4	More sm startup time reduction * The real fix, don't leave the OOB in blocking mode during comm_dyn_init(), as it means no progressing MPI events while the event library is waiting for TCP stuff to come in. * Add many comments explaining the reasons for the current ordering This commit was SVN r13422.	2007-02-01 18:47:43 +00:00
Jeff Squyres	0ce78f22fa	Complements r13351 -- use the new field on the ompi_proc_t struct to know what my local rank is, and therefore set my paffinity ID as appropriate. Specifically, we're no longer relying on the special/secret mpi_paffinity_processor MCA parameter that the orted would set for us. This allows processor affinity to be used in environments where the orted is not used (e.g., bproc, and someday in the hopefully not too-distant future, SLURM). This commit was SVN r13352. The following SVN revision numbers were found above: r13351 --> open-mpi/ompi@a338b7e533	2007-01-29 21:53:04 +00:00
Edgar Gabriel	1359ba9b13	Rewriting much of the errorcode and errorclass code, since - we have to be able to attach a string to an error class, not just to an error code - according to MPI-2 the attribute MPI_LASTUSEDCODE has to be updated every time you add a new code or a new class. Thus, you have to have a single list for both. Thus, we got rid of the error_class structure. In the error-code structure, we can distinguish whether we are dealing with an error code or an error class by looking at the err->code element of the structure. In case its value is MPI_UNDEFINED, the corresponding entry is a class, else it is an error code. All predefined error codes have the code and the class field set to the same value. The test MPI_Add_error_class1 passes now. Fixes trac:418 This commit was SVN r12764. The following Trac tickets were found above: Ticket 418 --> https://svn.open-mpi.org/trac/ompi/ticket/418	2006-12-05 19:07:02 +00:00
Ralph Castain	79bd8a842e	Add xcast timing in mpi_init since we cannot directly measure it (tree-based comm). This commit was SVN r12706.	2006-11-30 15:06:40 +00:00
Ralph Castain	bc4e97a435	First stage in the move to a faster startup. Change the ORTE stage gate xcast into a binary tree broadcast (away from a linear broadcast). Also, removed the timing report in the gpr_proxy component that printed out the number of bytes in the compound command message as the answer was "not much" - reduces the clutter in the data. This commit was SVN r12679.	2006-11-28 00:06:25 +00:00
Brian Barrett	33320b7165	Rework the opal_progress interface to better support dynamic processes and at the same time, remove some of the MPI-related options from OPAL: - provide mechanism to change at runtime whether sched_yield() should be called when the progress engine is idle - provide mechanism for changing the rate at which the event engine is called when there are "no" users of the event engine (ie, when using MPI but not TCP) - fix some function names in the progress engine to better match their intended use (and remove MPI naming scheme) - remove progress_mpi_enable / progress_mpi_disable because we can now use the functions to set the sched_yield and tick rate interfaces - rename opal_progress_events() to opal_progress_set_event_flag() because the first really isn't descriptive of what the function does and I always got confused by it This commit was SVN r12645.	2006-11-22 02:06:52 +00:00
Ralph Castain	6d6cebb4a7	Bring over the update to terminate orteds that are generated by a dynamic spawn such as comm_spawn. This introduces the concept of a job "family" - i.e., jobs that have a parent/child relationship. Comm_spawn'ed jobs have a parent (the one that spawned them). We track that relationship throughout the lineage - i.e., if a comm_spawned job in turn calls comm_spawn, then it has a parent (the one that spawned it) and a "root" job (the original job that started things). Accordingly, there are new APIs to the name service to support the ability to get a job's parent, root, immediate children, and all its descendants. In addition, the terminate_job, terminate_orted, and signal_job APIs for the PLS have been modified to accept attributes that define the extent of their actions. For example, doing a "terminate_job" with an attribute of ORTE_NS_INCLUDE_DESCENDANTS will terminate the given jobid AND all jobs that descended from it. I have tested this capability on a MacBook under rsh, Odin under SLURM, and LANL's Flash (bproc). It worked successfully on non-MPI jobs (both simple and including a spawn), and MPI jobs (again, both simple and with a spawn). This commit was SVN r12597.	2006-11-14 19:34:59 +00:00
Ralph Castain	4e50cdae52	This commit accomplishes two things: 1. Fix the "hang" condition when an application isn't found. It turned out that the ODLS had some difficulty with the process actually not having been started - hence, it never called the waitpid callback. As a result, the "terminated" trigger didn't fire, and so mpirun didn't wake up. With this change, the HNP's errmgr forces the issue by causing the trigger to fire itself when an abort condition occurs. 2. Shift the recording of the pid and the nodename from mpi_init to the orted launcher. This allows programs such as Eclipse PTP to get the pids even for non-MPI applications. In the case of bproc, the pls handles this chore since we don't use orteds in that system. This commit was SVN r12558.	2006-11-11 04:03:45 +00:00
Ralph Castain	c77f6c605e	Update timing reports: 1. Remove timing of xcast from mpi_init 2. Add timing report from oob_xcast on how long it took to send the message This commit was SVN r12428.	2006-11-03 18:55:05 +00:00
Ralph Castain	60e27c77e7	Add some additional timing reporting: 1. Added reporting points around the xcasts in MPI_Init. Note that these times will include time spent waiting for a trigger to fire, which is why the times between stage gates did NOT include these times initially. The inter-stage-gate times still do NOT include the xcast time - the xcast time is reported separately. 2. Added the process vpid on the MPI_Init timing reports for clarity. 3. Added a report from the xcast function on the HNP that outputs the number of bytes in the message being sent to the processes. This commit was SVN r12422.	2006-11-03 16:04:40 +00:00
Ralph Castain	36d4511143	Bring the timing instrumentation to the trunk. If you want to look at our launch and MPI process startup times, you can do so with two MCA params: OMPI_MCA_orte_timing: set it to anything non-zero and you will get the launch time for different steps in the job launch procedure. The degree of detail depends on the launch environment. rsh will provide you with the average, min, and max launch time for the daemons. SLURM block launches the daemon, so you only get the time to launch the daemons and the total time to launch the job. Ditto for bproc. TM looks more like rsh. Only those four environments are currently supported - anyone interested in extending this capability to other environs is welcome to do so. In all cases, you also get the time to setup the job for launch. OMPI_MCA_ompi_timing: set it to anything non-zero and you will get the time for mpi_init to reach the compound registry command, the time to execute that command, the time to go from our stage1 barrier to the stage2 barrier, and the time to go from the stage2 barrier to the end of mpi_init. This will be output for each process, so you'll have to compile any statistics on your own. Note: if someone develops a nice parser to do so, it would be really appreciated if you could/would share! This commit was SVN r12302.	2006-10-25 15:27:47 +00:00
Ralph Castain	0411f9772e	Begin instrumenting for scalability tests. I have added a new MCA param (hey, you can't have too many!) called OMPI_MCA_orte_timing. If set to anything other than zero, the system will report out critical timing loops. At the moment, this includes three measurements: 1. Time spent going through the RDS->RAS->RMAPS, setting up triggers, etc. prior to calling the actual PLS launch function. This is reported out as time to setup job. 2. Time spent in MPI_Init from start of that function (well, right after opal_init) to the place where we send all of our info to the registry. Reported out as time from start to exec_compound_cmd 3. Time actually spent executing the compound cmd. Reported out as time to exec_compound_cmd. A few additional timing points will be added shortly. These may eventually be removed or (better) set up with a conditional compile flag. This commit was SVN r11892.	2006-09-29 13:19:44 +00:00
Rainer Keller	80166a9516	- fix typos This commit was SVN r11703.	2006-09-19 07:55:41 +00:00
Ralph Castain	8c7f0ed9ae	Change the SOH to the new State Monitoring and Reporting (SMR) framework. New API's will be appearing in the new framework shortly - this just gets the name change into the system. Other changes: 1. Remove the old xcpu components as they are not functional. 2. Fix a "bug" in orterun whereby we called dump_aborted_procs even when we normally terminated. There is still some kind of bug in this procedure, however, as we appear to be calling the orterun job_state_callback function every time a process terminates (instead of only once when they have all terminated). I'll continue digging into that one. This will require an autogen/configure, I'm afraid. This commit was SVN r11228.	2006-08-16 16:35:09 +00:00
Brian Barrett	cdffc3158d	* only set threads if not running at thread single This commit was SVN r11193.	2006-08-15 15:55:53 +00:00
Jeff Squyres	b6c6d9a2b7	Bring over r10877 and r10881 from the /tmp/tbird branch: r10877: add warm up connection option.. of course this only warms up the first eager btl but this should be adequate for now.. r10881: Consulted with Galen and did a few things: - Fix the algorithm to actually make the connections that we want - Rename the MCA param to mpi_preconnect_all - Cleanup the code a bit: - move the logic to a separate .c file - check return codes properly This commit was SVN r11114. The following SVN revisions from the original message are invalid or inconsistent and therefore were not cross-referenced: r10877 r10877 r10881 r10881	2006-08-04 14:41:31 +00:00
Brian Barrett	a84e557815	Add new loop mode OPAL_EVLOOP_ONELOOP that behaved like OPAL_EVLOOP_ONCE did pre-libevent update. The problem is that the behavior of OPAL_EVLOOP_ONCE was changed by the OMPI team, which then broke things during the update, so it had to be reverted to the old meaning of loop until one event occurs. OPAL_EVLOOP_ONELOOP will go through the event loop once (like EVLOOP_NONBLOCK) but will pause in the event library for a bit (like EVLOOP_ONCE). fixes trac:234 This commit was SVN r11081. The following Trac tickets were found above: Ticket 234 --> https://svn.open-mpi.org/trac/ompi/ticket/234	2006-08-01 22:23:57 +00:00
Jeff Squyres	520147f209	Clean up the Fortran MPI sentinel values per problem reported on the users mailing list: http://www.open-mpi.org/community/lists/users/2006/07/1680.php Warning: this log message is not for the weak. Read at your own risk. The problem was that we had several variables in Fortran common blocks of various types, but their C counterparts were all of a type equivalent to a fortran double complex. This didn't seem to matter for the compilers that we tested, but we never tested static builds (which is where this problem seems to occur, at least with the Intel compiler: the linker complains that the variable in the common block in the user's .o file was of one size/alignment but the one in the C library was a different size/alignment). So this patch fixes the sizes/types of the Fortran common block variables and their corresponding C instantiations to be of the same sizes/types. But wait, there's more. We recently introduced a fix for the OSX linker where some C versions of the fortran common block variables (e.g., _ompi_fortran_status_ignore) were not being found when linking ompi_info (!). Further research shows that the code path for ompi_info to require ompi_fortran_status_ignore is, unfortunately, necessary (a quirk of how various components pull in different portions of the code base -- nothing in ompi_info itself requires fortran or MPI knowledge, of course). Hence, the real problem was that there was no code path from ompi_info to the portion of the code base where the C globals corresponding to the Fortran common block variables were instantiated. This is because the OSX linker does not automatically pull in .o files that only contain uninitialized global variables; the OSX linker typically only pulls in a .o file from a library if it either has a function that is used or has a global variable that is initialized (that's the short version; lots of details and corner cases omitted). 
Hence, we changed the global C variables corresponding to the fortran common blocks to be initialized, thereby causing the OSX linker to pull them in automatically -- problem solved. At the same time, we moved the constants to another .c file with a function, just for good measure. However, this didn't really solve the problem: 1. The function in the file with the C versions of the fortran common block variables (ompi/mpi/f77/test_constants_f.c) did not have a code path that was reachable from ompi_info, so the only reason that the constants were found (on OSX) was because they were initialized in the global scope (i.e., causing the OSX compiler to pull in that .o file). 2. Initializing these variables in the global scope causes problems for some linkers where -- once all the size/type problems mentioned above were fixed -- the alignments of fortran common blocks and C global variables do not match (even though the types of the Fortran and C variables match -- wow!). Hence, initializing the C variables would not necessarily match the alignment of what Fortran expected, and the linker would issue a warning (i.e., the alignment warnings referenced in the original post). The solution is two-fold: 1. Move the Fortran variables from test_constants_f.c to ompi/mpi/runtime/ompi_mpi_init.c where there are other global constants that are initialized (that had nothing to do with fortran, so the alignment issues described above are not a factor), and therefore all linkers (including the OSX linker) will pull in this .o file and find all the symbols that it needs. 2. Do not initialize the C variables corresponding to the Fortran common blocks in the global scope. Indeed, never initialize them at all (because we never need their values - we only check for their locations). 
Since nothing is ever written to these variables (particularly in the global scope), the linker does not see any alignment differences during initialization, but does make both the C and Fortran variables have the same addresses (this method has been working in LAM/MPI for over a decade). There were some comments here in the OMPI code base and in the LAM code base that stated/implied that C variables corresponding to Fortran common blocks had to have the same alignment as the Fortran common blocks (i.e., 16). There were attempts in both code bases to ensure that this was true. However, the attempts were wrong (in both code bases), and I have now read enough Fortran compiler documentation to convince myself that matching alignments is not required (indeed, it's beyond our control). As long as C variables corresponding to Fortran common blocks are not initialized in the global scope, the linker will "figure it out" and adjust the alignment to whatever is required (i.e., the greater of the alignments). Specifically (to counter comments that no longer exist in the OMPI code base but still exist in the LAM code base): - there is no need to make attempts to specially align C variables corresponding to Fortran common blocks - the types and sizes of C variables corresponding to Fortran common blocks should match, but do not need to be on any particular alignment Finally, as a side effect of this effort, I found a bunch of inconsistencies with the intent of status/array_of_statuses parameters. For all the functions that I modified they should be "out" (not inout). This commit was SVN r11057.	2006-07-31 15:07:09 +00:00
Brian Barrett	566a050c23	Next step in the project split, mainly source code re-arranging - move files out of toplevel include/ and etc/, moving it into the sub-projects - rather than including config headers with <project>/include, have them as <project> - require all headers to be included with a project prefix, with the exception of the config headers ({opal,orte,ompi}_config.h mpi.h, and mpif.h) This commit was SVN r8985.	2006-02-12 01:33:29 +00:00
Brian Barrett	b1d2424013	Merge in present work on the MPI-2 onesided chapter. The current code is not complete, but stable enough that it will have no impact on general development, so into the trunk it goes. Changes in this commit include: - Remove the --with option for disabling MPI-2 onesided support. It complicated code, and has no real reason for existing - add a framework osc (OneSided Communication) for encapsulating all the MPI-2 onesided functionality - Modify the MPI interface functions for the MPI-2 onesided chapter to properly call the underlying framework and do the required error checking - Created an osc component pt2pt, which is layered over the BML/BTL for communication (although it also uses the PML for long message transfers). Currently, all support functions, all communication functions (Put, Get, Accumulate), and the Fence synchronization function are implemented. The PWSC active synchronization functions and Lock/Unlock passive synchronization functions are still not implemented This commit was SVN r8836.	2006-01-28 15:38:37 +00:00
Brian Barrett	60ac1cb5f4	print stack traces (when available) for opal and orte processes, as well as ompi processes. Also add SIGABRT to the list of signals that are intercepted to print out pretty messages. This commit was SVN r8672.	2006-01-11 04:36:39 +00:00
Brian Barrett	45a3eccec6	re-enable the stack-trace functionality (it had been inadvertently disabled in a merge a long time ago). Set default signals to print a stack trace from none to SIGBUS,SIGSEGV. This commit was SVN r8603.	2005-12-22 19:39:50 +00:00
George Bosilca	58ba8f033d	Handle the case when there is no F77 support available for us. This commit was SVN r8238.	2005-11-22 21:52:14 +00:00
Jeff Squyres	9812694227	Ensure to instantiate MPI_F_STATUS_IGNORE and MPI_F_STATUSES_IGNORE. Thanks to Anthony Chan for pointing this out. Note that these will only work for the Fortran compiler that Open MPI was configured with -- since these values, are, by definition, single-value, they cannot support all 4 values that Open MPI may generate for the different Fortran name-mangling schemes. A lengthy comment in ompi_mpi_init.c explains this in more detail. Added to the README to explain this situation, as well as the forthcoming .TRUE. Fortran fixes. This commit was SVN r8231.	2005-11-22 15:24:39 +00:00
George Bosilca	932c67aeb3	MPI_COMM_WORLD should be the first communicator that gets created, even before MPI_COMM_SELF and MPI_COMM_NULL. This commit was SVN r8132.	2005-11-12 03:47:17 +00:00
Jeff Squyres	42ec26e640	Update the copyright notices for IU and UTK. This commit was SVN r7999.	2005-11-05 19:57:48 +00:00
Edgar Gabriel	3633605010	moving op_init further up in ompi_mpi_init, since it is required when querying some of the collective components. Up to now, it just worked somehow, but now with correct reference counting for ops in place, it refused :-) This commit was SVN r7866.	2005-10-25 18:33:48 +00:00
Jeff Squyres	a459659a33	Print the string name of the return code This commit was SVN r7789.	2005-10-17 20:47:44 +00:00
David Daniel	e4985c2a07	Moving totalview spin to the very end of mpi_init This commit was SVN r7444.	2005-09-20 15:22:15 +00:00
Galen Shipman	d932cfd342	merge of rcache work into the trunk.. lotsa fun ;-).. I regression tested before the merge, I will regression test tonight and correct issues that might have crept in. This commit was SVN r7329.	2005-09-12 22:28:23 +00:00
George Bosilca	948683215b	And more fixes ... This commit was SVN r7321.	2005-09-12 20:25:01 +00:00
Ralph Castain	96f4bb7a63	Hey, sports fans!! Guess what?? Here's the huge registry check-in you've all been waiting for with bated breath. The revised version sends a single message to all processes at the various stage gates, thus making the startup much more scalable. I could provide you with all the tawdry details, but won't for now - you are welcome to ask, though, and I'll merrily bore your ears to tears. In addition, the commit contains the following: 1. set the ignore properties on ompi/debuggers and orte/mca/pls/poe 2. Added simplified subscribe and put functions to the registry's API. I have also converted all of the ompi functions that registered subscriptions to the new API, and caught their associated put's as well. In a follow-on commit, I'll be adding support for George's hetero arch registry subscription (wanted to get this one in first). This commit was SVN r7118.	2005-09-01 01:07:30 +00:00
David Daniel	227947fc51	Moving MPI-side TotalView support into a separate directory ompi/debuggers/ so that compilation options can be more easily controlled. This commit was SVN r7113.	2005-08-31 20:35:15 +00:00
David Daniel	995641c1e6	Don't initialize proctable more than once (since the stage gate 1 trigger seems to get fired at least twice). This commit was SVN r7101.	2005-08-31 00:21:55 +00:00
David Daniel	a5d9199e7f	Adding a simple hook for TotalView that is activated if a particular MCA parameter is set. orterun/MPI integration still not quite working. This commit was SVN r7097.	2005-08-30 17:34:23 +00:00
Jeff Squyres	c9cdb36b0b	Finally get this right: move orte_sys_info.[ch] back into the orte tree. - fix up #include's throughout the tree (yay contrib/search_replace.pl!) - remove a few extraneous #include's - remove orte_sys_info*() from opal_init()/opal_finalize() (it's already in orte_init_stage1() and orte_system_finalize()) - remove dependencies in opal on orte_system_info -- util/os_path.c and util/os_create_dirpath.c (they only used path_sep, anyway -- easily changed to #defines) This commit was SVN r7059.	2005-08-26 21:03:41 +00:00
Josh Hursey	4eefb33182	Some param changes: - Change orte_base_infrastructre to orte_infrastructre to conform with ompi_info's needs - Move MCA Param registration in ORTE to a centralized function that is called first in orte_init_stage1 - Set the infrastructure flag as an argument to orte_init - Adjust initialization functions to properly pass down the infrastructure flag. This commit was SVN r7053.	2005-08-26 20:13:35 +00:00
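
Note on commit 3daf8b341b (SVN r13430): the four-step sched_yield decision described in that message can be sketched roughly as below. This is only an illustration of the stated logic, not the code from ompi_mpi_init.c; the function name `decide_sched_yield` and its parameters are hypothetical, and `None` stands in for "not specified" or "could not be determined".

```python
from typing import Optional


def decide_sched_yield(user_setting: Optional[bool],
                       num_processors: Optional[int],
                       num_local_procs: int) -> bool:
    """Hypothetical sketch of the decision sequence in commit 3daf8b341b.

    user_setting    -- explicit user choice, or None if nothing was specified
    num_processors  -- processors on this node, or None if it cannot be found
    num_local_procs -- procs from this job that share this node
    """
    # Step 1: an explicit user setting always wins.
    if user_setting is not None:
        return user_setting
    # Step 4: processor count unknown, so default to the conservative choice.
    if num_processors is None:
        return True
    # Step 3: yield only when the node is oversubscribed by this job.
    return num_local_procs > num_processors
```

As the commit message itself points out, a check of this shape only sees processes from the calling job, so procs added later via comm_spawn or belonging to other jobs on the same node are not accounted for.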


66 Commits
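
Note on commit 6d6cebb4a7 (SVN r12597): the job "family" idea in that message (parent, root, immediate children, all descendants, and termination that optionally includes descendants) might look something like the following. This is a minimal sketch under assumptions: the class `JobFamily`, its method names, and the `include_descendants` flag are hypothetical stand-ins for the name service APIs and the ORTE_NS_INCLUDE_DESCENDANTS attribute, whose real signatures are not shown in the log.

```python
class JobFamily:
    """Hypothetical tracker for parent/child relationships between jobs."""

    def __init__(self):
        self._parent = {}    # jobid -> parent jobid (None for an original job)
        self._children = {}  # jobid -> list of directly spawned jobids

    def add_job(self, jobid, parent=None):
        # A comm_spawn'ed job records the job that spawned it as its parent.
        if parent is not None and parent not in self._parent:
            raise KeyError(f"unknown parent job {parent}")
        self._parent[jobid] = parent
        self._children.setdefault(jobid, [])
        if parent is not None:
            self._children[parent].append(jobid)

    def parent(self, jobid):
        return self._parent[jobid]

    def root(self, jobid):
        # Walk up the lineage until reaching the job that started things.
        while self._parent[jobid] is not None:
            jobid = self._parent[jobid]
        return jobid

    def children(self, jobid):
        return list(self._children[jobid])

    def descendants(self, jobid):
        # Breadth-first walk over everything spawned below jobid.
        found = []
        pending = list(self._children[jobid])
        while pending:
            job = pending.pop(0)
            found.append(job)
            pending.extend(self._children[job])
        return found

    def jobs_to_terminate(self, jobid, include_descendants=False):
        # Mirrors the idea of terminate_job with a "descendants" attribute.
        targets = [jobid]
        if include_descendants:
            targets.extend(self.descendants(jobid))
        return targets
```

For example, if job 1 spawns jobs 2 and 4, and job 2 spawns job 3, then `jobs_to_terminate(1, include_descendants=True)` covers all four jobs, while `jobs_to_terminate(2)` covers only job 2.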