Jeff Squyres
c40740947f
Fix minor spelling error.
...
This commit was SVN r18229.
2008-04-22 13:11:50 +00:00
Galen Shipman
27c425b304
make portals-level ACKs optional (require ACK by default)
...
This commit was SVN r18228.
2008-04-21 22:22:18 +00:00
Ralph Castain
c3ddf66445
Move the display-allocation code to where it is always seen
...
This commit was SVN r18227.
2008-04-21 20:28:59 +00:00
Ralph Castain
16c9100633
Add --display-allocation option to orterun that will display the node-by-node information regarding your allocation.
...
This commit was SVN r18216.
2008-04-20 02:25:45 +00:00
Rich Graham
df35223603
add selection logic for barrier and reduce.
...
This commit was SVN r18215.
2008-04-19 22:40:04 +00:00
Rich Graham
bee8b42f29
remove debug code that would not let people run.
...
Add infrastructure for blocking-barrier.
This commit was SVN r18214.
2008-04-19 01:34:04 +00:00
Josh Hursey
56a61bfacf
switch the name of orterun to mpirun to make things more clear.
...
This commit was SVN r18208.
2008-04-18 12:59:23 +00:00
Galen Shipman
92e3b8671f
nasty memory bug...
...
This commit was SVN r18207.
2008-04-18 03:01:53 +00:00
Jeff Squyres
db2695ccab
Make the symbols be visible.
...
This commit was SVN r18201.
2008-04-18 00:26:17 +00:00
Jeff Squyres
a198971fa2
Temporarily disable Solaris ports support in libevent. Refs trac:1273
...
This commit was SVN r18199.
The following Trac tickets were found above:
Ticket 1273 --> https://svn.open-mpi.org/trac/ompi/ticket/1273
2008-04-17 23:14:43 +00:00
Ralph Castain
fa082cafa9
Shift the architecture calculation from the ompi/datatype engine to the opal/util area. This allows us to compute the architecture earlier in the launch and communicate it outside of the modex.
...
Note: this is an early preliminary step in the movement of portions of the datatype engine to the opal layer.
This commit was SVN r18198.
2008-04-17 20:43:56 +00:00
George Bosilca
01148b77dc
Generate the help message for the available event ops. Now the list only
...
contains the ones that are compiled on the current ompi.
This commit was SVN r18196.
2008-04-17 18:16:54 +00:00
Ralph Castain
07f0a71faa
Cleanup the show_help entries on the seq mapper
...
This commit was SVN r18191.
2008-04-17 14:43:15 +00:00
Ralph Castain
e7487ad533
Implement the seq rmaps module that sequentially maps process ranks to a list of hosts in a hostfile.
...
Restore the "do-not-launch" functionality so users can test a mapping without launching it.
Add a "do-not-resolve" cmd line flag to mpirun so the opal/util/if.c code does not attempt to resolve network addresses, thus enabling a user to test a hostfile mapping without hanging on network resolve requests.
Add a function to hostfile to generate an ordered list of host names from a hostfile
This commit was SVN r18190.
2008-04-17 13:50:59 +00:00
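A minimal sketch of the sequential mapping described in r18190: rank i is placed on the i-th host listed in the hostfile. This is an illustration only, not the rmaps code. The function names `ordered_hosts` and `seq_map` are hypothetical, and the hostfile handling (one host per line, `#` starting a comment) is an assumption.

```python
def ordered_hosts(hostfile_text):
    """Return host names in the order they appear (assumed: one host per line)."""
    hosts = []
    for line in hostfile_text.splitlines():
        # Assumption: '#' starts a comment; anything after the host name is ignored.
        line = line.split("#", 1)[0].strip()
        if line:
            hosts.append(line.split()[0])
    return hosts


def seq_map(hostfile_text):
    """Map rank i to the i-th host entry, repeating hosts if the file repeats them."""
    return dict(enumerate(ordered_hosts(hostfile_text)))
```

Because the mapping depends only on the text of the hostfile, it can be inspected without launching anything, which is the kind of dry run the "do-not-launch" flag is meant to allow.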
Tim Prins
eb94fa48ce
the port name is only relevant at the root, so only look at it there.
...
This commit was SVN r18188.
2008-04-17 12:37:10 +00:00
Tim Prins
3582e11200
cleanup some warnings on 32 bit systems
...
This commit was SVN r18187.
2008-04-17 12:25:05 +00:00
Tim Prins
b2acb51d04
make comm_join work again. Allocate memory to the correct pointer.
...
This commit was SVN r18186.
2008-04-17 11:56:53 +00:00
Rich Graham
6c77fa4921
add a blocking shared memory algorithm.
...
This commit was SVN r18185.
2008-04-16 22:10:23 +00:00
Ralph Castain
eb27e4f23d
Move the reissuing of the daemon recv to occur after the message actually gets processed. This ensures that we don't get multiple messages trying to be processed at the same time.
...
Add one more debug output to see where messages are heading
This commit was SVN r18183.
2008-04-16 20:41:00 +00:00
Ralph Castain
66e532669a
Remove some dead code
...
This commit was SVN r18182.
2008-04-16 20:33:53 +00:00
Ralph Castain
3413191e52
Fix singleton and singleton comm_spawn
...
This commit was SVN r18177.
2008-04-16 14:38:10 +00:00
Ralph Castain
7b91f8baff
Cleanup and fix bugs in the MPI dynamics section. Modify the dpm API so it properly takes ports instead of process names (as correctly identified by Aurelien). Fix race conditions in the use of ompi-server. Fix incompatibilities between the mpi bindings and the dpm implementation that could cause segfaults due to uninitialized memory.
...
Fix the ompi-server -h cmd line option so it actually tells you something!
Add two new testing codes to the orte/test/mpi area: accept and connect.
This commit was SVN r18176.
2008-04-16 14:27:42 +00:00
Shiqing Fan
aa616b9530
Check whether the debugger is running and whether the convertor is valid.
...
Add a loop to skip the DT_LOOP element.
This commit was SVN r18175.
2008-04-16 13:58:58 +00:00
Shiqing Fan
49fbc4e795
These functions should always have a return value.
...
This commit was SVN r18174.
2008-04-16 13:54:15 +00:00
Shiqing Fan
1c4c7e0f2f
Add memchecker support for osc rdma communication.
...
This commit was SVN r18173.
2008-04-16 13:29:55 +00:00
Shiqing Fan
79da2fdd2c
Use the new memchecker convertor function.
...
Remove some unnecessary memchecker calls.
This commit was SVN r18172.
2008-04-16 13:24:35 +00:00
Adrian Knoth
d34dfbe12c
fixed misleading comment.
...
This commit was SVN r18170.
2008-04-16 11:26:15 +00:00
Adrian Knoth
20473bfda2
on incoming connections, compare with every possible source address.
...
Rationale (taken from the code):
/* This is PITA. We never know which source address an
* incoming/outgoing packet will have, so even with
* btl_tcp_if_include/exclude on the remote end, we
* might get a different source address.
*
* If this address isn't included in btl_proc->proc_addrs,
* we would erroneously drop the connection
*/
merge -r18165:18167 to the trunk.
This commit was SVN r18169.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r18165
r18167
2008-04-16 11:24:09 +00:00
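A minimal sketch of the check described in r18169: an incoming connection is accepted if its source address matches any address known for the peer, not only the first one. The function name `accept_peer` and the representation of `proc_addrs` as a list of address strings are assumptions made for illustration; the real code operates on the BTL's own structures.

```python
import ipaddress


def accept_peer(source, proc_addrs):
    """Return True if the source address equals any of the peer's known addresses."""
    src = ipaddress.ip_address(source)
    # Compare parsed addresses so equivalent textual forms (e.g. IPv6) still match.
    return any(src == ipaddress.ip_address(addr) for addr in proc_addrs)
```

Comparing against the full list is what avoids dropping a valid connection when the packet leaves the remote host through a different interface than expected.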
Adrian Knoth
e981a259bb
btl_tcp_disable_family=4 and btl_tcp_disable_family=6 are mutually
...
exclusive, so this should result in "unreachable" when set differently
between peers.
This commit was SVN r18168.
2008-04-16 10:14:58 +00:00
Adrian Knoth
84e4013530
Always declare oob_tcp_disable_family, no matter if --disable-ipv6 is set.
...
This commit was SVN r18164.
2008-04-16 09:31:15 +00:00
Adrian Knoth
0ddfff4ffe
Added new oob-tcp parameter oob_tcp_disable_family.
...
Like btl_tcp_disable_family, this parameter more or less disables
a whole address family. Though the sockets are still created, the
corresponding information isn't added to the connection strings.
Likewise, we don't try to connect to addresses matching the disabled
address family.
This is particularly important for multidomain clusters, where IPv4 is
often filtered (firewalled), sometimes by simply dropping the packets
instead of rejecting them (thus causing a connection timeout instead of
a quick "no route to host").
This commit was SVN r18163.
2008-04-16 09:22:00 +00:00
Adrian Knoth
75c54616c7
renamed opal_sockaddr2str to opal_net_get_hostname for WANT_PEER_DUMP=1
...
This commit was SVN r18154.
2008-04-15 19:23:47 +00:00
Tim Mattox
55b2546026
Update the NEWS file for a 1.2.7 change.
...
This commit was SVN r18153.
2008-04-15 17:31:57 +00:00
Jeff Squyres
72af302360
Remove unused variable.
...
This commit was SVN r18151.
2008-04-15 14:58:32 +00:00
Ralph Castain
a4ea756a76
Ensure the node loop cntr gets incremented if the daemon already exists
...
This commit was SVN r18150.
2008-04-15 14:20:03 +00:00
Ralph Castain
73e4cfe58a
Add platform files for LANL's RRZ cluster. Update LANL platform files to not build libnbc, vt to save time
...
This commit was SVN r18146.
2008-04-15 02:19:54 +00:00
Ralph Castain
35c260a14f
Fix the plm modules to accommodate the new remote_spawn entry - set that entry to NULL for all but rsh as only that module supports it at this time
...
This commit was SVN r18145.
2008-04-14 19:36:13 +00:00
Ralph Castain
84156c422f
Egad! Typo snuck in there...nasty vi!
...
This commit was SVN r18144.
2008-04-14 18:29:11 +00:00
Ralph Castain
7c7304466c
Add a binomial tree-based launch to ssh, turned "on" only when the plm_rsh_tree_spawned mca param is set to a non-zero value. This probably isn't a very optimized capability, but it does execute a tree-based launch that may scale better than linear at high node counts.
...
Add the daemon map capability to the ODLS to create and save a map of daemon vpid vs nodename from the launch message.
Cleanup a few places in the base plm launch support where we didn't adequately protect rml recv's from potentially executing sends.
This commit was SVN r18143.
2008-04-14 18:26:08 +00:00
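To illustrate why a binomial tree launch (r18143) may scale better than a linear one, here is a sketch of one common binomial tree layout over daemon ranks. It is not taken from the plm code: the function name `binomial_children` and the numbering scheme are assumptions, and the actual launch may order daemons differently.

```python
def binomial_children(rank, size):
    """Children of `rank` in a binomial tree rooted at 0 over `size` ranks."""
    children = []
    step = 1
    # Skip the bits already used to reach this rank from the root.
    while step <= rank:
        step <<= 1
    # Each higher power of two yields one child, as long as it is in range.
    while rank + step < size:
        children.append(rank + step)
        step <<= 1
    return children
```

In this layout the root starts only about log2(size) daemons itself, and each started daemon launches its own children in turn, whereas a linear launch has the root start every daemon.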
Aurelien Bouteiller
0f311ed824
Make sure the function returns NULL when no elan adapter is available instead of a random value.
...
This commit was SVN r18136.
2008-04-11 21:03:01 +00:00
Aurelien Bouteiller
20592cbcbf
Fixes a warning about mallocing 0 bytes when no elan adapter is available.
...
This commit was SVN r18135.
2008-04-11 20:59:12 +00:00
Aurelien Bouteiller
921a6ce3d4
Processes with different jobids can now connect/accept to each other.
...
This commit was SVN r18134.
2008-04-11 15:40:59 +00:00
Rich Graham
249445d61f
added reduce-scatter followed by gather to root.
...
This commit was SVN r18133.
2008-04-11 13:49:08 +00:00
Rich Graham
a6bdbfab97
implement allreduce as reduce-scatter, followed by an allgather.
...
This commit was SVN r18132.
2008-04-11 04:06:29 +00:00
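The decomposition named in r18132 can be sketched as a single-process simulation, with one list per rank. This is an illustration of the idea only, assuming a sum reduction and a vector length divisible by the number of ranks; the function names are hypothetical and no real communication takes place.

```python
def reduce_scatter(vectors):
    """Rank r ends up with the element-wise sum of chunk r across all ranks."""
    nranks = len(vectors)
    length = len(vectors[0])
    assert length % nranks == 0, "sketch assumes evenly divisible vectors"
    chunk = length // nranks
    return [
        [sum(v[i] for v in vectors) for i in range(r * chunk, (r + 1) * chunk)]
        for r in range(nranks)
    ]


def allgather(chunks):
    """Every rank receives the concatenation of all ranks' chunks."""
    full = [x for c in chunks for x in c]
    return [list(full) for _ in chunks]


def allreduce(vectors):
    """Allreduce expressed as reduce-scatter followed by allgather."""
    return allgather(reduce_scatter(vectors))
```

Replacing the final `allgather` with a gather to a single rank gives the reduce variant described in r18133.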
Jon Mason
08ead87604
Potential double free of locks
...
mca_btl_openib_endpoint_post_rr_nolock is freeing the endpoint lock on
the error case, but most/all of the functions calling this free the lock
regardless of its error case, thus resulting in a double free of the
lock.
This commit was SVN r18131.
2008-04-10 21:15:01 +00:00
Ralph Castain
e050f37578
Cleanup a few warnings about initializing variables.
...
Remove an obsolete data value.
This commit was SVN r18129.
2008-04-10 19:15:16 +00:00
Rich Graham
70f3aab5f2
remove some code that is not needed.
...
This commit was SVN r18128.
2008-04-10 17:32:04 +00:00
Rich Graham
5c7db1e315
remove 2 race conditions in the buffer recycling logic.
...
This commit was SVN r18127.
2008-04-10 17:20:52 +00:00
Ralph Castain
851279fc9f
Consolidate the daemon wireup message into the launch message. The daemons don't need their contact info prior to the launch message anyway. This not only eliminates a job-wide communication from the startup procedure, but it also resolves a race condition reported when operating across highly distributed (i.e., cross-country) networks. In such scenarios, it proved possible for a daemon to receive its launch message -before- it had received the contact info message, even though the latter had been sent first!
...
This eliminates that problem...
This commit was SVN r18126.
2008-04-10 15:35:11 +00:00
Ralph Castain
4b798cf29a
Massage these platform files a little...
...
This commit was SVN r18125.
2008-04-10 15:32:41 +00:00