Commit graph

11355 commits

Author SHA1 Message Date
Lenny Verkhovsky
fa6a084d33 added opal/mca/paffinity/base/paffinity_base_service.c with paffinity functions
This commit was SVN r18020.
2008-03-30 12:01:02 +00:00
Lenny Verkhovsky
7e45d7e134 Few updates due to RMAPS rank_file component changes
1. applied prefix rule to functions and variables of RMAPS rank_file component
2. cleaned ompi_mpi_init.c from paffinity code
3. paffinity code moved to new opal/mca/paffinity/base/paffinity_base_service.c file
4. added opal_paffinity_slot_list mca parameter

This commit was SVN r18019.
2008-03-30 11:52:11 +00:00
Lenny Verkhovsky
cb83a1287d Really deleted old files now
This commit was SVN r18018.
2008-03-30 11:50:19 +00:00
Lenny Verkhovsky
f734ba51a4 Added files with names according to prefix rule
This commit was SVN r18017.
2008-03-30 11:42:09 +00:00
Lenny Verkhovsky
b43f4a2dc9 Deleted and added files after prefix rule changes
This commit was SVN r18016.
2008-03-30 11:41:01 +00:00
Jeff Squyres
8d79bfe860 Fix for CID 937. All we really care about is being able to chdir;
the extra checks were unnecessary.

This commit was SVN r18015.
2008-03-29 13:15:22 +00:00
Jeff Squyres
d0f12f3df0 Make a better error message.
This commit was SVN r18014.
2008-03-29 12:54:24 +00:00
Rich Graham
3b42d2268d Add functions to handle two different input buffers and a separate
output buffer.  User-defined data types have no way to make use
of these.

This commit was SVN r18012.
2008-03-28 23:45:44 +00:00
Shiqing Fan
f82092566f We don't have inttypes.h on Windows, and some types are redefined.
This commit was SVN r18010.
2008-03-28 17:33:54 +00:00
Shiqing Fan
aaf2730fab Winsock2.h also has definition for timeval and so on, it conflicts with our own definitions.
This commit was SVN r18009.
2008-03-28 17:30:33 +00:00
Rich Graham
90e53ca9ee debug the pipeline algorithm.
This commit was SVN r18008.
2008-03-28 15:10:07 +00:00
Aurelien Bouteiller
77653ac787 Missing .h file in makefile broke nightly tarball distcheck...
This commit was SVN r18006.
2008-03-28 14:36:56 +00:00
Ralph Castain
6fcaa8df39 Remove stale define. Add global variable to be used soon.
This commit was SVN r18005.
2008-03-28 02:20:37 +00:00
Jeff Squyres
6ea36061cf Fix typo found by Pak.
This commit was SVN r18000.
2008-03-27 23:04:17 +00:00
Aurelien Bouteiller
c16339944a Fix a coverity warning about using unsafe sprintf.
This commit was SVN r17999.
2008-03-27 21:24:27 +00:00
Aurelien Bouteiller
e11237aadb Introduction of the "progress" sender_based method to replace the slow isend-self method.
This commit was SVN r17998.
2008-03-27 21:19:45 +00:00
Aurelien Bouteiller
93db01871e This is part of the previous patch.
This commit was SVN r17997.
2008-03-27 21:06:14 +00:00
Aurelien Bouteiller
f8bf6f2c6a Code cleanup.
sender_based.h is now split in two files, to solve cyclic .h files inclusion. 
Most macros are now inline functions.
Variable names have been changed in various places.
Various other small things... 

This commit was SVN r17996.
2008-03-27 21:05:44 +00:00
George Bosilca
691806680a I guess this wasn't really intended ...
This commit was SVN r17995.
2008-03-27 18:41:06 +00:00
George Bosilca
303941f642 Avoid a deadlock. The comment explains how this might happen.
This commit was SVN r17994.
2008-03-27 18:37:11 +00:00
George Bosilca
be4b153f0d Another patch for thread safety in the TCP BTL (thanks to Pierre).
This commit was SVN r17993.
2008-03-27 18:36:08 +00:00
Ralph Castain
9f1001a6f8 Ensure that the procs know how many daemons will be participating in collective operations.
This commit was SVN r17992.
2008-03-27 17:31:54 +00:00
Tim Prins
c5736e3f9a Remove old constants used with the registry.
This commit was SVN r17991.
2008-03-27 17:13:20 +00:00
Jeff Squyres
c06f7c3992 Fixes trac:1254: ensure that evport.c is in the distribution tarball.
This commit was SVN r17989.

The following Trac tickets were found above:
  Ticket 1254 --> https://svn.open-mpi.org/trac/ompi/ticket/1254
2008-03-27 16:40:55 +00:00
Ralph Castain
6166278e18 Improve the scalability of the modex operation and fix a bug reported by Tim P
The bug was a race condition in the barrier operation that caused the barrier in MPI_Finalize to fail on very short programs.

Scalability was improved by using the daemons to aggregate modex and barrier messages before sending them to the rank=0 proc. Improvement is proportional to ppn, of course, but there really wasn't a scaling problem at low ppn anyway. This modification also paves the way for better allgather operations since now all the data for each node is sitting at the daemon level, and the daemons are now aware that a collective operation on the OOB is underway (so they -can- participate in a collective of their own to support it).

Also added better diagnostics to map out the timing associated with MPI_Init - turned on by -mca orte_timing 1.

This commit was SVN r17988.
2008-03-27 15:17:53 +00:00
Gleb Natapov
cf40674369 Decide if sends should be throttled at the receiver and pass this to the sender
in an ACK message. The decision can't be done reliably at the sender.

This commit was SVN r17987.
2008-03-27 08:56:43 +00:00
Rich Graham
e2ad9c4be2 adjust to change in orte_process_info.
This commit was SVN r17986.
2008-03-27 01:25:28 +00:00
Rich Graham
441fb9fb9e checkpoint.
This commit was SVN r17985.
2008-03-27 01:16:32 +00:00
Ralph Castain
8e6da2ee76 Maintain the mapping bookmark across multiple comm_spawns
This commit was SVN r17984.
2008-03-27 00:19:13 +00:00
George Bosilca
8e8b8950ef Add support for Interix.
This commit was SVN r17983.
2008-03-26 23:20:33 +00:00
Ralph Castain
abfb3577c1 Ensure that the bookmark of the parent job is applied to the child in a comm_spawn so we start mapping from the right place
This commit was SVN r17982.
2008-03-26 21:18:16 +00:00
Jeff Squyres
a2795fe43d Very minor modification against r17980: check the whole string against
"all", not just the first 3 chars (i.e., if someone sets the value
"allfoo", we should still error).

This commit was SVN r17981.

The following SVN revision numbers were found above:
  r17980 --> open-mpi/ompi@b3ef774d46
2008-03-26 19:10:02 +00:00
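
The difference between the prefix check and the whole-string check described in r17981 can be sketched as follows. This is a minimal illustration, not the code from the commit: the helper names `is_all_exact` and `is_all_prefix` are hypothetical and do not come from the Open MPI source.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true only when the value is exactly "all". */
static bool is_all_exact(const char *value)
{
    return value != NULL && strcmp(value, "all") == 0;
}

/* Prefix-only comparison, shown for contrast: it also accepts "allfoo",
 * because only the first 3 characters are compared. */
static bool is_all_prefix(const char *value)
{
    return value != NULL && strncmp(value, "all", 3) == 0;
}
```

With the exact comparison, a value such as "allfoo" is no longer treated as "all" and can be reported as an error by the caller.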
Josh Hursey
b3ef774d46 A fix for r17956.
r17956 broke the ability for the user to override the 'opal_event_include'
parameter. This commit checks to see if the user specified a value before
forcing the "all" value on the event engine.

This commit fixes Checkpoint/Restart support in the trunk which requires
this feature.

This commit was SVN r17980.

The following SVN revision numbers were found above:
  r17956 --> open-mpi/ompi@763218e754
2008-03-26 14:54:09 +00:00
Sharon Melamed
afa98f92e8 Changed the for loop to a while loop so I could
release the edge without conflicting with get next.

This commit was SVN r17979.
2008-03-26 14:45:45 +00:00
Josh Hursey
55044c3c4f A fix resulting from r17944. Need to make sure we go through
orte_proc_info_finalize properly so the 'init' flag is set on restart.

This is a bit cleaner anyway, esp since the GPR is gone.

This commit was SVN r17978.

The following SVN revision numbers were found above:
  r17944 --> open-mpi/ompi@ec76fe4fe4
2008-03-26 14:13:33 +00:00
Ralph Castain
7ad6db207c Cover some timing-related output
This commit was SVN r17977.
2008-03-26 12:54:50 +00:00
Rainer Keller
ce8154eb3e - Coverity issues CID 945:
Event uninit_use: Using uninitialized value "rc"
   Instead of initializing rc in the beginning, rather use return value
   of opal_hash_table_set_value_uint32.

This commit was SVN r17976.
2008-03-26 11:39:25 +00:00
Jeff Squyres
33c09b30c2 Patch from George: ensure that we don't overwrite timer_linux_happy
improperly when checking the host type.

This commit was SVN r17975.
2008-03-26 11:22:57 +00:00
Rainer Keller
b7efc2b18e - Coverity issues CID 42:
Event var_deref_model: Variable "array_of_integers" tracked as NULL was
   passed to a function that dereferences it. [model]
   The arrays passed down type_get_contents may be NULL, only iff max_* is 0...
   If the max_* parameter does not fit, an error is returned, anyhow.
   One could improve the checks of MPI_PARAM_CHECK, but to be on the
   safe side, fix in dt_args.c.

This commit was SVN r17974.
2008-03-26 09:07:06 +00:00
Rainer Keller
334b64e760 - Coverity issue CID 35:
Event var_deref_op: Variable "requests" tracked as NULL was
   dereferenced.
   Only check requests[i] for NULL, if requests is != NULL itself.

This commit was SVN r17973.
2008-03-26 08:19:55 +00:00
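
The guard described in r17973 (only look at requests[i] when requests itself is non-NULL) might look roughly like the sketch below. The function name `check_requests`, its return convention, and the use of `void *` for the request handles are assumptions made for illustration; the actual Open MPI parameter check is not shown in this log.

```c
#include <stddef.h>

/* Hypothetical validity check for an array of request handles.
 * Returns 0 when the arguments are acceptable, -1 otherwise. */
static int check_requests(void *const *requests, int count)
{
    if (count < 0) {
        return -1;
    }
    if (requests == NULL) {
        /* A NULL array is only acceptable when there is nothing to read. */
        return (count == 0) ? 0 : -1;
    }
    /* requests is known to be non-NULL here, so indexing it is safe. */
    for (int i = 0; i < count; ++i) {
        if (requests[i] == NULL) {
            return -1;
        }
    }
    return 0;
}
```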
Rainer Keller
56f3d59f2a - Coverity issues 939, 940, 941:
Event uninit_use_in_call: Using uninitialized value "tag" in call to
   function "(ompi_dpm).connect_accept" and others
   The tag is set and used in get_rport only on root...

This commit was SVN r17972.
2008-03-26 08:09:11 +00:00
George Bosilca
4a5431ef11 Remove the event-config.h file, it is never used.
Correct the include logic that protects the headers. It's amazing
that this didn't bite us yet ...

This commit was SVN r17971.
2008-03-26 03:33:43 +00:00
Brad Benton
0b84dfd2a6 POE is not currently working or supported, so removing from the trunk.
This commit was SVN r17970.
2008-03-26 02:06:40 +00:00
Ralph Castain
60d931217f Modify the routed framework to allow greater control/flexibility over response to lost routes and initial wireup of jobs as required by several soon-to-come new modules.
Specifically, add two new APIs:

1. lost_route: allows the OOB to report that a connection has failed, thereby giving the routed module an opportunity to respond appropriately to its topology. Creating the API also allows each routed component to hold its own definition of "lifeline" - in some cases, this may be a single connection, but in others it may be multiple connections. Some modules may choose to re-route messaging if the lifeline or any other connection is lost, while others may choose to abort the job.

Both the tree and unity modules retain the current behavior and abort the job if the lifeline connection is lost, while ignoring other lost connections.

2. get_wireup_info: returns (in a provided buffer) info required to wireup connections for the specified job. Some routed modules do not need to return any info as they can wireup via alternative means, while some need to xchg data with their peers. If info is inserted into the buffer, the plm_base_launch_apps function will xcast the contents to the specified job.

The commit also removes the "lifeline" entry from the orte_process_info struct (and the associated ORTE_PROC_MY_LIFELINE definition) as the lifeline info is now contained within the respective routed module.

This commit was SVN r17969.
2008-03-26 01:00:24 +00:00
George Bosilca
64bc580c78 Use evutil_timercmp instead of timercmp to take advantage of the
fallback installed in evutil.h.

This commit was SVN r17968.
2008-03-25 23:54:30 +00:00
George Bosilca
a01f3f762c Check if extra is NULL or not ...
This commit was SVN r17967.
2008-03-25 22:43:46 +00:00
George Bosilca
bea5c0f734 Don't allocate anything if we don't really need it, and avoid leaking memory.
This commit was SVN r17966.
2008-03-25 22:43:11 +00:00
George Bosilca
2ed6ed37bd Don't forget to cleanup once we're done.
This commit was SVN r17965.
2008-03-25 22:42:24 +00:00
George Bosilca
ac6121bd1c Remove unused variable.
This commit was SVN r17964.
2008-03-25 22:41:50 +00:00
George Bosilca
fe2636cb4a Coverity fix: use snprintf instead of sprintf.
This commit was SVN r17963.
2008-03-25 22:41:25 +00:00
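
As a rough illustration of the sprintf-to-snprintf change named in r17963, the sketch below formats into a fixed-size buffer and detects truncation. The helper `format_label` and its "name.rank" format are hypothetical and are not taken from the Open MPI code touched by the commit.

```c
#include <stdio.h>

/* Hypothetical helper: writes "name.rank" into buf, never writing more
 * than len bytes. Returns 0 on success, -1 on error or truncation. */
static int format_label(char *buf, size_t len, const char *name, int rank)
{
    /* snprintf returns the length the full output would have had,
     * so a result >= len means the output was truncated. */
    int n = snprintf(buf, len, "%s.%d", name, rank);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Unlike sprintf, snprintf is told the buffer size, so an oversized input yields a truncated, NUL-terminated string instead of a buffer overflow.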