openmpi

Author	SHA1	Message	Date
Ralph Castain	4fff584a68	Commit the orted-failed-to-start code. This correctly causes the system to detect the failure of an orted to start and allows the system to terminate all procs/orteds that did start. The primary change that underlies all this is in the OOB. Specifically, the problem in the code until now has been that the OOB attempts to resolve an address when we call the "send" to an unknown recipient. The OOB would then wait forever if that recipient never actually started (and hence, never reported back its OOB contact info). In the case of an orted that failed to start, we would correctly detect that the orted hadn't started, but then we would attempt to order all orteds (including the one that failed to start) to die. This would cause the OOB to "hang" the system. Unfortunately, revising how the OOB resolves addresses introduced a number of additional problems. Specifically, and most troublesome, was the fact that comm_spawn involved the immediate transmission of the rendezvous point from parent-to-child after the child was spawned. The current code used the OOB address resolution as a "barrier" - basically, the parent would attempt to send the info to the child, and then "hold" there until the child's contact info had arrived (meaning the child had started) and the send could be completed. Note that this also caused comm_spawn to "hang" the entire system if the child never started... The app-failed-to-start helped improve that behavior - this code provides additional relief. With this change, the OOB will return an ADDRESSEE_UNKNOWN error if you attempt to send to a recipient whose contact info isn't already in the OOB's hash tables. To resolve comm_spawn issues, we also now force the cross-sharing of connection info between parent and child jobs during spawn. Finally, to aid in setting triggers to the right values, we introduce the "arith" API for the GPR. This function allows you to atomically change the value in a registry location (either divide, multiply, add, or subtract) by the provided operand. It is equivalent to first fetching the value using a "get", then modifying it, and then putting the result back into the registry via a "put". This commit was SVN r14711.	2007-05-21 18:31:28 +00:00
Tim Prins	2ffc02870d	Reduce the memory usage of the GPR: - Make it so that all the GPR pointer arrays are allocated initially at 16 elements instead of 512. This saves (on a 64 bit machine) approximately 4*(# procs + # nodes) KB. - Fix up the segment prealloc function so that preallocating an existing segment is not an error, and make the areas where we do large inserts use it. Fix the orte_pointer_array to efficiently implement setting its size. Before we just realloced the array one block at a time until the desired size was reached. Now we resize it all in one realloc. This commit was SVN r14264.	2007-04-09 00:40:15 +00:00
Ralph Castain	53967bd698	Fix a memory corruption problem deep inside the registry when subscriptions/triggers are processed. The create_value function will malloc space for the pointers to keyval objects, but doesn't actually allocate space for the objects themselves. When constructing the gpr_notify_data object, we forgot to OBJ_NEW the keyval objects. Since the create_value function didn't explicitly NULL those memory locations, it just so happened that there was a non-NULL address in them....which we dutifully dumped a keyval into. This fix includes two parts: (a) we now initialize the keyval pointer locations to NULL after the malloc, and (b) we now OBJ_NEW the keyvals prior to storing info in them. BTW, in case anyone reads this and wonders why we don't just OBJ_NEW the keyvals in create_value, the reason is simply that some places in the code use static keyvals and simply assign those addresses into the value object's array. So not everyone wants to OBJ_NEW keyvals - by not forcing it here in create_value, we give the user the flexibility to do whatever they want. This commit was SVN r13300.	2007-01-25 12:54:02 +00:00
Ralph Castain	0a5d41857a	Complete next round of message size reduction: "strip" the descriptive info from the returned values. I have now added a flag to the gpr address mode (ORTE_GPR_STRIPPED) that instructs the gpr to not include segment names or tokens in the returned gpr_value_t objects. I found only two places that were looking at the tokens: 1. the odls - we used the tokens to separately process the globals container data from everything else. In this case, I left the subscription that returned the globals data alone, but "stripped" the subscription that returned the launch data for the procs. These subscriptions have nothing to do with the xcast message. 2. the pml_base_modex - the callback function was getting process names from the returned tokens. Actually, this function was doing a very bad thing - it was assuming that the first token returned was always the process name. This is currently true, but is one of those assumptions that someone could have easily changed - and suddenly found the system inexplicably failing. I modified the function to (a) get the name sent back to us, (b) "stripped" the value structures of tokens and segment strings, and (c) correctly obtained process names from the returned values. I also reindented the heck out of the code so it was legible (at least, to my old eyes). This commit was SVN r12813.	2006-12-09 23:10:25 +00:00
George Bosilca	3fd278c522	Make the tree compile in debug mode. This commit was SVN r12724.	2006-12-01 23:03:09 +00:00
Ralph Castain	897744cdeb	Two major changes to the runtime: 1. implement and enable the non-described buffer operations. I will send out a more detailed explanation separately. However, this mode of operation (which is now the default) significantly reduces message size during startup. If you want the described buffers, set the mca param "-mca dss_describe_buffer 1". 2. revise the xcast system to support both linear and binomial tree broadcast methods. Since we are seeing scenarios where the binomial tree can cause problems, I have made the linear method the default. To run with the binomial tree, set the mca param "-mca oob_xcast_mode binomial". 3. add some detailed timing reports to the xcast operation. These are enabled via "-mca oob_xcast_timing 1". 4. add some more unit tests for the dss and gpr (focused on support for the non-described buffer) This commit was SVN r12722.	2006-12-01 22:30:39 +00:00
George Bosilca	2aa3e51223	Nothing relevant. Only a set of castings to have a clean compile on Windows. The cl.exe compiler is pretty good at complaining about any kind of non explicit cast. This commit was SVN r12207.	2006-10-20 02:25:50 +00:00
Ralph Castain	13227e36ab	This commit looks a lot bigger than it is, so relax :-) Fix the problem observed by multiple people that comm_spawned children were (once again) being mapped onto the same nodes as their parents. This was caused by going through the RAS a second time, thus overwriting the mapper's bookkeeping that told RMAPS where it had left off. To solve this - and to continue moving forward on the ORTE development - we introduce the concept of attributes to control the behavior of the RM frameworks. I defined the attributes and a list of attributes as new ORTE data types to make it easier for people to pass them around (since they are now fundamental to the system, and therefore we will be packing and unpacking them frequently). Thus, all the functions to manipulate attributes can be implemented and debugged in one place. I used those capabilities in two places: 1. Added an attribute list to the rmgr.spawn interface. 2. Added an attribute list to the ras.allocate interface. At the moment, the only attribute I modified the various RAS components to recognize is the USE_PARENT_ALLOCATION one (as defined in rmgr_types.h). So the RAS components now know how to reuse an allocation. I have debugged this under rsh, but it now needs to be tested on a wider set of platforms. This commit was SVN r12138.	2006-10-17 16:06:17 +00:00
George Bosilca	f52c10d18e	And ORTE is ready for prime-time. All Windows tricks are in: - use the OPAL functions for PATH and environment variables - make all headers C++ friendly - no unnamed structures - no implicit cast. Plus a full implementation for the orte_wait functions. This commit was SVN r11347.	2006-08-23 03:32:36 +00:00
Ralph Castain	c3ba1c1cc1	Fix a pack/unpack mismatch This commit was SVN r11315.	2006-08-22 13:50:59 +00:00
George Bosilca	6afa4c6c64	Windows friendly version. We have to split the OMPI_DECLSPEC in at least 3 different macros, one for each project. Therefore, now we have OPAL_DECLSPEC, ORTE_DECLSPEC and OMPI_DECLSPEC. Please use them based on the sub-project. This commit was SVN r11270.	2006-08-20 15:54:04 +00:00
Ralph Castain	5dfd54c778	With the branch to 1.2 made.... Clean up the remainder of the size_t references in the runtime itself. Convert to orte_std_cntr_t wherever it makes sense (only avoid those places where the actual memory size is referenced). Remove the obsolete oob barrier function (we actually obsoleted it a long time ago - just never bothered to clean it up). I have done my best to go through all the components and catch everything, even if I couldn't test compile them since I wasn't on that type of system. Still, I cannot guarantee that problems won't show up when you test this on specific systems. Usually, these will just show as "warning: comparison between signed and unsigned" notes which are easily fixed (just change a size_t to orte_std_cntr_t). In some places, people didn't use size_t, but instead used some other variant (e.g., I found several places with uint32_t). I tried to catch all of them, but... Once we get all the instances caught and fixed, this should once and for all resolve many of the heterogeneity problems. This commit was SVN r11204.	2006-08-15 19:54:10 +00:00
Brian Barrett	e737b0a106	Fix a bunch of warnings the Sun compilers find: - The constant 1 is a signed int by default. Explicitly say that it is an unsigned value so we can't overflow - Fix unreachable statement warnings in dss_arith by breaking out of switch statements instead of returning - this should have no impact on performance, since it's a non-conditional jump - A couple of the GPR files had carriage returns and were in DOS mode - put them in unix mode... These should all probably go to the v1.1 branch... This commit was SVN r9664.	2006-04-20 15:35:58 +00:00
Ralph Castain	b9bdb2125e	Fix and upgrade the console to support better debugging. Activate "dump" commands to display registry content. Remove the blasted opal_output default prefix that made the dump output illegible. Properly connect to existing daemons and/or start new ones. This commit was SVN r9528.	2006-04-04 11:05:52 +00:00
Brian Barrett	6be35fb604	* Use the ORTE_<type> constants instead of internal DSS_TYPE_<type>_T constants for the type to be packed / unpacked when dealing with sized types (like size_t) so that the dss_unpack code to deal with types of different sizes is activated. Necessary for proper 32/64 interoperability. This commit was SVN r9475.	2006-03-30 14:33:25 +00:00
Brian Barrett	566a050c23	Next step in the project split, mainly source code re-arranging - move files out of toplevel include/ and etc/, moving it into the sub-projects - rather than including config headers with <project>/include, have them as <project> - require all headers to be included with a project prefix, with the exception of the config headers ({opal,orte,ompi}_config.h mpi.h, and mpif.h) This commit was SVN r8985.	2006-02-12 01:33:29 +00:00
Ralph Castain	4b9f015c0b	Merge in the new data support subsystem for ORTE. MPI folks should not notice a difference. Longer explanation will be sent to developers mailing list. This commit was SVN r8912.	2006-02-07 03:32:36 +00:00
Brian Barrett	bc4d3d6fff	IRIX compile fixes: - Need to make sure that SIZE_MAX exists as a constant if stdint.h doesn't exist - struct timeval is defined in unistd.h on IRIX, so need to include that header file wherever struct timeval is used. This commit was SVN r8361.	2005-12-01 18:28:20 +00:00
Brian Barrett	8faa1884f0	* The last of the build system optimizations. Combine the component and component/base Makefile.am files, reducing the time configure spends stamping out Makefiles at the end * Install base_impl.h file when devel-headers are being installed This commit was SVN r8200.	2005-11-20 01:03:01 +00:00
Jeff Squyres	42ec26e640	Update the copyright notices for IU and UTK. This commit was SVN r7999.	2005-11-05 19:57:48 +00:00
Jeff Squyres	ce78b76598	Quick fix from Ralph -- this escaped committing last night. This commit was SVN r7917.	2005-10-28 14:03:26 +00:00
Ralph Castain	afeeacd76d	Complete hookup of the registry proxy for the get_conditional command. This commit was SVN r7915.	2005-10-28 05:35:07 +00:00
Ralph Castain	eebda71a0b	Add a new API to the registry for conditional data retrievals. The new API allows you to retrieve data from registry containers that have key-value pairs where the value matches the specified one. The requested keys are then retrieved from that container. This commit was SVN r7907.	2005-10-28 00:30:58 +00:00
Ralph Castain	6c839048cf	Fix a typo that caused valgrind to bark on 64-bit machines. Actually was a potential source of error, so the barking was legit. This commit was SVN r7677.	2005-10-10 02:34:26 +00:00
Ralph Castain	b589a93e29	Continue to lace the trace functionality into orte... This commit was SVN r7427.	2005-09-19 15:29:14 +00:00
Brian Barrett	ed56e743b7	* update configure.ac to use the modern version of AC_INIT and AM_INIT_AUTOMAKE, instead of the deprecated version. * Work around dumbness in modern AC_INIT that requires the version number to be set at autoconf time (instead of at configure time, as it was before). Set the version number, minus the subversion r number, at autoconf time. Override the internal variables to include the r number (if needed) at configure time. Basically, the right thing should always happen. The only place it might not is the version reported as part of configure --help will not have an r number. * Since AM_INIT_AUTOMAKE takes a list of options, no need to specify them in all the Makefile.am files. * Adds support for subdir-objects, meaning that object files are put in the directory containing source files, even if the Makefile.am is in another directory. This should start making it feasible to reduce the number of Makefile.am files we have in the tree, which will greatly reduce the time to run autogen and configure. This commit was SVN r7211.	2005-09-07 05:54:53 +00:00
Rainer Keller	a36347d728	- Support -prefix specification on mpirun/orterun cmd-line per app_context: mpirun -np 2 -prefix /path/to/ompi/on/machineA ./exec1 : \ -np 2 -prefix /path/to/ompi/on/machineB ./exec2 - Allow with -mca pls_rsh_assume_same_shell 0, the checking for the SHELL-variable on the actual node (currently 1st node). Sets the prefix, PATH and LD_LIBRARY_PATH for bash/ksh and csh/tcsh. This commit was SVN r7195.	2005-09-06 16:10:05 +00:00
Ralph Castain	03e45e6723	Two quick additions: 1. Added OMPI_PROC_ARCH as a defined registry key and added the code so that the architecture info gets properly transmitted across all processes using the startup message. 2. Added an OMPI_MODEX_KEY definition and removed the hard-coded "modex" key from pml_modex_exchange This commit was SVN r7129.	2005-09-01 15:05:03 +00:00
Jeff Squyres	3962c53e2e	- Add to AM_CPPFLAGS $(OPAL_LTDL_CPPFLAGS) where necessary in order to add a -I to find the included ltdl.h (vs. a system-installed ltdl.h) - Clean up kruft in a bunch of Makefile.am's to remove now-unnecessary AM_CPPFLAGS settings to get static-components.h for each framework - Move the component_repository API functions out of opal/mca/base/base.h and into opal/mca/base/mca_base_component_repository.h in order to decrease unnecessary dependencies (e.g., before this, almost everything in the tree depended on ltdl.h, which is unnecessary -- only a small number of files really need ltdl.h) This commit was SVN r7127.	2005-09-01 12:16:36 +00:00
Ralph Castain	96f4bb7a63	Hey, sports fans!! Guess what?? Here's the huge registry check-in you've all been waiting for with bated breath. The revised version sends a single message to all processes at the various stage gates, thus making the startup much more scalable. I could provide you with all the tawdry details, but won't for now - you are welcome to ask, though, and I'll merrily bore your ears to tears. In addition, the commit contains the following: 1. set the ignore properties on ompi/debuggers and orte/mca/pls/poe 2. Added simplified subscribe and put functions to the registry's API. I have also converted all of the ompi functions that registered subscriptions to the new API, and caught their associated put's as well. In a follow-on commit, I'll be adding support for George's hetero arch registry subscription (wanted to get this one in first). This commit was SVN r7118.	2005-09-01 01:07:30 +00:00
Jeff Squyres	cce0950df7	- change a bunch of OMPI_* constants or ORTE_* equivalents - change the framework opens to [mostly] use the new MCA param API - properly pass in framework debug output streams to the mca_base_component_open() function This commit was SVN r6888.	2005-08-15 18:25:35 +00:00
Ralph Castain	4e1837687b	Finish simplified interfaces for put and subscribe - more details to come. This commit was SVN r6713.	2005-08-02 19:43:29 +00:00
Ralph Castain	8c6c78c47a	Add a few new functions that were requested last week - not tested yet, so please don't use them! I will test them this afternoon on a different computer. For now, they won't cause any problems since they aren't being called. This commit was SVN r6689.	2005-08-01 16:38:15 +00:00
Ralph Castain	4e79a51395	Add a job_info segment to the system that holds a container for each job. Within each container is a keyval indicating the job state (i.e., all procs at stage1, finalized, etc.). This provides a rough state-of-health for the job. This required a little fiddling with a number of areas. Biggest problem was that it uncovered a potential for an infinite loop to be created in the registry. If a callback function modified the registry, the registry checked the triggers to see if anything had fired. Well, if the original callback was due to a trigger firing, that condition hadn't changed - so the trigger fired again....which caused the callback to be called, which modified the registry, which checked the triggers, etc. etc. Triggers are now checked and then "flagged" as being "in process" so that the registry will NOT recheck that trigger until all callbacks have been processed. Tried doing this with subscriptions as well, but that caused a problem - when we release processes from a stagegate, they (at the moment) immediately place data on the registry that should cause a subscription to fire. Unfortunately, the system will just hang if that subscription doesn't get processed. So, I have left the subscription system alone - any callback function that modifies the registry in a fashion that will fire a subscription will indeed fire that subscription. We'll have to see if this causes problems - it shouldn't, but a careless user could lock things up if the callback generates a callback to itself. Also fixed the code that placed a process' RML contact info on the registry to eliminate the leading '/' from the string. This commit was SVN r6684.	2005-07-29 14:11:19 +00:00
George Bosilca	9fdfbd9934	correct the printf for 64 bits architectures. This commit was SVN r6667.	2005-07-28 19:54:06 +00:00
Ralph Castain	f604fb72db	Turn "on" the delete functionality for the registry. Should now be able to delete entries and segments, and get an index of the dictionary entries on the registry. Haven't fully tested these yet (nobody is using them at the moment that I know of - good thing, since they haven't been working for a long time - though I know the MPI-2 stuff needs the functionality), but will do so shortly. For now, they compile. This commit was SVN r6567.	2005-07-20 18:07:46 +00:00
Ralph Castain	19d58ee17e	First phase of the scalable RTE changes: 1. Modify the registry to eliminate redundant data copying for startup messages. 2. Revise the subscription/trigger system to avoid redundant storage of triggers and subscriptions. This dramatically reduces the search time when a registry action occurs - to illustrate the point, there are now only a handful of triggers on the system for each job. Before, there were a handful of triggers for each PROCESS in the job, all of which had to be checked every time something happened on the registry. This is much, much faster now. 3. Update all subscriptions to the new format. There are now "named" subscriptions - this allows you to "name" a subscription that all the processes will be using. The first one to hit the registry actually defines the subscription. From then on, any subsequent "subscribes" to the same name just cause that process to "attach" to the existing subscription. This keeps the number of subscriptions being tracked by the registry to a minimum, while ensuring that each process still gets notified. 4. Do the same for triggers. Also fixed a duplicate subscription problem that was causing people to receive data equal to the number of processes times the data they should have received from a trigger/subscription. Sorry about that... :-( ...but it's all better now! Uncovered a situation where the modex data seems to be getting entered on the registry a second time - the latter time coming after the compound command has been "fired", thereby causing all the subscriptions to fire. Asked Tim and Jeff to look into this. Second phase of the changes will involve modifying the xcast system so that the same message gets sent to all processes. This will further reduce the message traffic, and - once we have a true "broadcast" version of xcast - really speed things up and improve scalability. This commit was SVN r6542.	2005-07-18 18:49:00 +00:00
Ralph Castain	44ace2f64e	Well, I think this will fix the bug Greg encountered when sending no triggers on a subscription. However, I can't test it since the trunk no longer runs on my Mac notebook - I get an error message "No ptl components available. This shouldn't happen." and the processes exit. This commit was SVN r6476.	2005-07-14 01:32:36 +00:00
Brian Barrett	a991d883c1	* Rewrite ompi_mca.m4 to use m4_defined lists of projects (ompi, orte, etc.), frameworks, and components without configure scripts instead of hard-coded shell variables (for projects and frameworks) and shell variable building (for components). * Add 3rd category of component configuration (in addition to configure scripts and no-configured components): configure.m4 components. These components can only be built as part of OMPI (like no-configure), but can provide an m4 file that is run as part of the main configure script. These macros can set whether the component should be built, along with just about any other configuration wanted. More care must be taken compared to configure components, as doing things like setting variables or calling AC_MSG_ERROR now affects the top-level configure script (so calling AC_MSG_ERROR if your component can't configure probably isn't what you want) * Added support to autogen.sh for the configure.m4-style components, as well as building up the m4_define lists ompi_mca.m4 now expects * Updated a number of macros to be more config.cache friendly (both so that config.cache can be used and so the test can be quickly run multiple times in the same configure script): - ompi_config_asm - c_weak_symbols - c_get_alignment * Added new macros to be shared when configuring components: - ompi_objc.m4 (this actually provides AC_PROG_OBJC - don't ask...) - ompi_check_xgrid - ompi_check_tm - ompi_check_bproc * Updated a number of components to use configure.m4 instead of configure.stub - btl portals - io romio - tm ras and pls - bjs, lsf_bproc ras and bproc_seed pls - xgrid ras and pls - null iof (used by tm) This commit was SVN r6412.	2005-07-09 18:52:53 +00:00
Brian Barrett	0ae16f2ab7	* add local hook to remove static-components.h in distclean target. The files are generated by configure, and not part of the tarball, so distclean would be the right place to remove them. This commit was SVN r6390.	2005-07-08 13:54:12 +00:00
Jeff Squyres	6a9c9953bc	Remove a bunch of -I's that are no longer necessary with properly-prefixed static-component.h files. This commit was SVN r6342.	2005-07-04 18:24:58 +00:00
Brian Barrett	9f44b80291	* rename ompi_argv to opal_argv * rename ompi_basename to opal_basename * rename ompi bitop functions to opal * rename ompi_cmd_line to opal_cmd_line * rename ompi_sizet2int to opal_sizet2int * rename orte_daemon_init to opal_daemon_init * rename ompi_few to opal_few This commit was SVN r6330.	2005-07-04 00:13:44 +00:00
Brian Barrett	a13166b500	* rename ompi_output to opal_output This commit was SVN r6329.	2005-07-03 23:31:27 +00:00
Brian Barrett	39dbeeedfb	* rename locking code from ompi to opal This commit was SVN r6327.	2005-07-03 22:45:48 +00:00
Brian Barrett	761402f95f	* rename ompi_list to opal_list This commit was SVN r6322.	2005-07-03 16:22:16 +00:00
Brian Barrett	499e4de1e7	* rename ompi_object and ompi_class to opal_object and opal_class This commit was SVN r6321.	2005-07-03 16:06:07 +00:00
Jeff Squyres	282a8b5e8d	More orte Makefile.am updates This commit was SVN r6287.	2005-07-02 15:13:41 +00:00
Jeff Squyres	1b18979f79	Initial population of orte tree This commit was SVN r6266.	2005-07-02 13:42:54 +00:00
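
Note on commit 4e79a51395: the message describes an infinite loop in which a trigger's callback modifies the registry, the registry rechecks its triggers, and the same trigger fires again. The fix it describes, flagging a trigger as "in process" until its callbacks finish, can be illustrated with a toy model. This is only a sketch in Python, not the ORTE C code; `Trigger`, `ToyTriggerRegistry`, and every method name here are hypothetical.

```python
class Trigger:
    """Hypothetical trigger: fires `callback` when `condition(data)` is true."""

    def __init__(self, name, condition, callback):
        self.name = name
        self.condition = condition
        self.callback = callback
        self.in_process = False  # set while this trigger's callback is running


class ToyTriggerRegistry:
    """Toy key-value store that checks its triggers after every put."""

    def __init__(self):
        self.data = {}
        self.triggers = []

    def add_trigger(self, trigger):
        self.triggers.append(trigger)

    def put(self, key, value):
        self.data[key] = value
        self._check_triggers()

    def _check_triggers(self):
        for trig in self.triggers:
            # Skip a trigger whose callback is still running; without this
            # guard a callback that calls put() would re-fire itself forever.
            if trig.in_process:
                continue
            if trig.condition(self.data):
                trig.in_process = True
                try:
                    trig.callback(self)
                finally:
                    trig.in_process = False


calls = []


def on_stage1(registry):
    calls.append("fired")
    # The callback itself modifies the registry, as in the commit message.
    registry.put("job_state", "all procs at stage1")


reg = ToyTriggerRegistry()
reg.add_trigger(Trigger("stage1", lambda d: d.get("count", 0) >= 2, on_stage1))
reg.put("count", 1)  # condition not met yet
reg.put("count", 2)  # condition met: callback runs once
```

In this model the flag is cleared as soon as the callback returns, so a later put that still satisfies the condition would fire the trigger again; how the real registry decides when a trigger may be rechecked is not stated in the log beyond "until all callbacks have been processed".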

48 commits
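
Note on commit 4fff584a68: the message says the new "arith" operation atomically applies add, subtract, multiply, or divide to a registry value and is equivalent to a get, a modification, and a put. The sketch below shows that idea with a lock held across the whole read-modify-write. It is a Python illustration only, not the GPR interface; `ToyRegistry`, `ArithOp`, and the choice of integer division are assumptions made for the example.

```python
import threading
from enum import Enum


class ArithOp(Enum):
    """Hypothetical operation codes; the real GPR constants are not shown in the log."""
    ADD = "add"
    SUB = "sub"
    MUL = "mul"
    DIV = "div"


class ToyRegistry:
    """In-memory stand-in for a registry location store."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def put(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key):
        with self._lock:
            return self._data[key]

    def arith(self, key, op, operand):
        # Hold the lock for the full get/modify/put sequence so no other
        # thread can change the value between the fetch and the store.
        with self._lock:
            current = self._data[key]
            if op is ArithOp.ADD:
                result = current + operand
            elif op is ArithOp.SUB:
                result = current - operand
            elif op is ArithOp.MUL:
                result = current * operand
            elif op is ArithOp.DIV:
                result = current // operand  # integer division: an assumption
            else:
                raise ValueError(f"unsupported operation: {op!r}")
            self._data[key] = result
            return result


registry = ToyRegistry()
registry.put("trigger_level", 10)
registry.arith("trigger_level", ArithOp.SUB, 3)
registry.arith("trigger_level", ArithOp.MUL, 2)
```

A separate get followed by a put would give the same result in a single-threaded run, but two callers interleaving those calls could lose an update, which is the case an atomic operation avoids.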