Aurelien Bouteiller
0f311ed824
Make sure the function returns NULL when no elan adapter is available instead of a random value.
...
This commit was SVN r18136.
2008-04-11 21:03:01 +00:00
Aurelien Bouteiller
20592cbcbf
Fixes a warning about mallocing 0 bytes when no elan adapter is available.
...
This commit was SVN r18135.
2008-04-11 20:59:12 +00:00
Aurelien Bouteiller
921a6ce3d4
Processes with different jobids can now connect/accept to each other.
...
This commit was SVN r18134.
2008-04-11 15:40:59 +00:00
Rich Graham
249445d61f
added reduce-scatter followed by gather to root.
...
This commit was SVN r18133.
2008-04-11 13:49:08 +00:00
Rich Graham
a6bdbfab97
implement allreduce as reduce-scatter, followed by an allgather.
...
This commit was SVN r18132.
2008-04-11 04:06:29 +00:00
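The decomposition named in r18132 can be illustrated without MPI. What follows is a minimal serial sketch in C, not the code from that commit: NRANKS, COUNT, and every sim_* name are hypothetical, ranks are simulated as array rows, and the reduction is fixed to an integer sum with COUNT divisible by NRANKS.

```c
#define NRANKS 4   /* hypothetical number of simulated ranks */
#define COUNT  8   /* elements per rank; assumed divisible by NRANKS */
#define PER    (COUNT / NRANKS)

/* Phase 1, reduce-scatter: rank r ends up owning chunk r of the
 * element-wise sum over all ranks. */
static void sim_reduce_scatter(int in[NRANKS][COUNT], int chunk[NRANKS][PER])
{
    for (int r = 0; r < NRANKS; r++) {
        for (int i = 0; i < PER; i++) {
            int sum = 0;
            for (int src = 0; src < NRANKS; src++) {
                sum += in[src][r * PER + i];
            }
            chunk[r][i] = sum;
        }
    }
}

/* Phase 2, allgather: every rank collects every reduced chunk, so each
 * rank finishes with the complete reduced vector. */
static void sim_allgather(int chunk[NRANKS][PER], int out[NRANKS][COUNT])
{
    for (int dst = 0; dst < NRANKS; dst++) {
        for (int r = 0; r < NRANKS; r++) {
            for (int i = 0; i < PER; i++) {
                out[dst][r * PER + i] = chunk[r][i];
            }
        }
    }
}

/* Allreduce expressed as the two phases above. */
void sim_allreduce_sum(int in[NRANKS][COUNT], int out[NRANKS][COUNT])
{
    int chunk[NRANKS][PER];
    sim_reduce_scatter(in, chunk);
    sim_allgather(chunk, out);
}
```

In a real MPI program the two phases would correspond to a reduce-scatter call followed by an allgather call; the serial loops here only show which data each rank holds after each phase.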
Jon Mason
08ead87604
Potential double free of locks
...
mca_btl_openib_endpoint_post_rr_nolock is freeing the endpoint lock on
the error case, but most/all of the functions calling this free the lock
regardless of its error case. This results in a double free of the
lock.
This commit was SVN r18131.
2008-04-10 21:15:01 +00:00
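The locking convention behind r18131 can be sketched as follows. This is a toy illustration, not the openib BTL code: toy_lock_t and every function below are hypothetical, and the "lock" is a counter that records over-releases instead of a real mutex. It shows one way to avoid the problem, assuming the caller is the sole owner of the release.

```c
#include <stdbool.h>

/* Toy lock that counts how often it is released while not held. */
typedef struct {
    int held;
    int bad_unlocks;
} toy_lock_t;

static void toy_lock(toy_lock_t *l)
{
    l->held++;
}

static void toy_unlock(toy_lock_t *l)
{
    if (l->held == 0) {
        l->bad_unlocks++;   /* released twice: the bug being avoided */
    } else {
        l->held--;
    }
}

/* "_nolock" helper: the caller already holds the lock, so the helper
 * must not release it on any path, including the error path. */
static int post_nolock(toy_lock_t *l, bool fail)
{
    (void)l;
    return fail ? -1 : 0;
}

/* Caller: acquires the lock, calls the helper, and releases exactly
 * once regardless of the helper's return code. */
int post_locked(toy_lock_t *l, bool fail)
{
    toy_lock(l);
    int rc = post_nolock(l, fail);
    toy_unlock(l);
    return rc;
}
```

If post_nolock also called toy_unlock on failure, the error path would release twice and bad_unlocks would become nonzero.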
Ralph Castain
e050f37578
Cleanup a few warnings about initializing variables.
...
Remove an obsolete data value.
This commit was SVN r18129.
2008-04-10 19:15:16 +00:00
Rich Graham
70f3aab5f2
remove some code that is not needed.
...
This commit was SVN r18128.
2008-04-10 17:32:04 +00:00
Rich Graham
5c7db1e315
remove 2 race conditions in the buffer recycling logic.
...
This commit was SVN r18127.
2008-04-10 17:20:52 +00:00
Ralph Castain
851279fc9f
Consolidate the daemon wireup message into the launch message. The daemons don't need their contact info prior to the launch message anyway. This not only eliminates a job-wide communication from the startup procedure, but it also resolves a race condition reported when operating across highly distributed (i.e., cross-country) networks. In such scenarios, it proved possible for a daemon to receive its launch message -before- it had received the contact info message, even though the latter had been sent first!
...
This eliminates that problem...
This commit was SVN r18126.
2008-04-10 15:35:11 +00:00
Ralph Castain
4b798cf29a
Massage these platform files a little...
...
This commit was SVN r18125.
2008-04-10 15:32:41 +00:00
Edgar Gabriel
4964434205
reverting commit 18122, since the commit was executed accidentally in the
...
wrong directory. The UH copyrights do belong in this file (i.e. because of
the fix which is in the 1.2 branch, the UH copyright notes are in the header
there already), but I want to have the proper log for that.
This commit was SVN r18124.
2008-04-10 15:09:31 +00:00
Edgar Gabriel
5989fa570c
Sorry, previous commit was in the wrong directory. This is the real fix (have
...
to undo 18122).
The verification of recvcount==0 and rank = root was breaking
inter-communicator scatter, since the root (root==MPI_ROOT) might very well
have recvcount=0. The same fix has been applied to gather.c, just the other way
round.
Fixes the bug reported on the mailing list by Martin Audet. If there is a
1.2.7, this fix might be worth porting over.
Please note that while the test now works for basic and for inter, we get a
0-byte malloc warning from the inter module, which we still have to fix in a
separate patch.
This commit was SVN r18123.
2008-04-10 15:03:14 +00:00
Edgar Gabriel
f87830767a
the verification of recvcount==0 and rank = root was breaking
...
inter-communicator scatter, since the root (root==MPI_ROOT) might very well
have recvcount=0. The same fix has been applied to gather.c, just the other way
round.
Fixes the bug reported on the mailing list by Martin Audet. If there is a
1.2.7, this fix might be worth porting over.
Please note that while the test now works for basic and for inter, we get a
0-byte malloc warning from the inter module, which we still have to fix in a
separate patch.
This commit was SVN r18122.
2008-04-10 14:58:51 +00:00
Ralph Castain
57e3e86cda
Use the proper exit code for mpirun to indicate an error when something goes wrong during launch (in scenarios where the procs don't report the problem directly themselves)
...
This commit was SVN r18121.
2008-04-10 09:15:08 +00:00
Ralph Castain
e7d0dae89d
Ensure we update the daemon collective trees if num_procs changes, but only if it changes
...
This commit was SVN r18120.
2008-04-10 03:44:18 +00:00
Ralph Castain
22343e6e0b
Given total lack of interest/support from the folks behind these environments, and the fact that we can now scale so well with our own daemons, it seems unlikely that we will be able to pursue direct and/or standalone launch in these environments. If that situation ever changes, it is easy enough to revive the effort since little had really been done to-date.
...
Meantime, no reason to continue dragging these around.
This commit was SVN r18119.
2008-04-10 02:54:13 +00:00
Ralph Castain
dc2f88b9f0
Now that we have the daemon collectives, the unity routed module no longer needs the "hack" we inserted a week ago to tell the daemons how to talk directly to all the application procs. The modex and barrier messages flow cleanly across the daemons and are "dropped" into the procs where required.
...
Add some insurance to make certain that the daemons' number of procs only gets updated when it absolutely is intended.
This commit was SVN r18118.
2008-04-10 02:45:42 +00:00
Ralph Castain
0b3122ee2f
Update the cnos module - should (hopefully) compile and work...
...
This commit was SVN r18117.
2008-04-09 22:33:00 +00:00
Ralph Castain
86b4ae5970
Remove a generated file from the repository - shouldn't have been there
...
This commit was SVN r18116.
2008-04-09 22:13:51 +00:00
Ralph Castain
3a0d09300b
Fully implement the inbound binomial allgather for daemon-based collectives. Supports both modex and barrier operations.
...
Comm_spawn still uses the rank=0 method - shifting that algo to the daemons is under study.
This commit was SVN r18115.
2008-04-09 22:10:53 +00:00
Ralph Castain
7cb1e72f76
Okay, let's really get those libz references out of there!
...
This commit was SVN r18114.
2008-04-09 22:05:09 +00:00
Ralph Castain
95d7e177c6
Not really a test, but a useful tool for testing computation of binomial trees
...
This commit was SVN r18113.
2008-04-09 21:58:42 +00:00
Ralph Castain
3120428f0f
Update several platform files to remove the libz dependency, add a couple for the Mac
...
This commit was SVN r18112.
2008-04-09 21:57:59 +00:00
Rich Graham
c6783549ef
getting old
...
This commit was SVN r18110.
2008-04-09 16:55:16 +00:00
Rich Graham
1a20c3ce51
more debug.
...
This commit was SVN r18109.
2008-04-09 16:19:52 +00:00
Rich Graham
e7e18303f6
more debug.
...
This commit was SVN r18108.
2008-04-09 15:10:58 +00:00
Rich Graham
b14c6b17d5
adding debug output.
...
This commit was SVN r18107.
2008-04-09 13:32:01 +00:00
Ralph Castain
11c6773c83
Commit a patch from Brian that fixes potential segfaults in systems where IPv6 include files are found, but the kernel doesn't actually support IPv6.
...
This commit was SVN r18106.
2008-04-09 12:53:24 +00:00
Rich Graham
10434fb2f1
add barrier synchronization at the end of the module init, to
...
avoid initializing shared memory variables in use.
This commit was SVN r18105.
2008-04-09 03:44:40 +00:00
Rich Graham
19bb1a2e86
fix initialization bug.
...
This commit was SVN r18104.
2008-04-08 23:34:06 +00:00
Donald Kerr
38e298cc9a
report error message in all libs, not just debug
...
This commit was SVN r18103.
2008-04-08 22:58:28 +00:00
Rich Graham
a69a8d9626
initialize the flags.
...
This commit was SVN r18102.
2008-04-08 22:16:39 +00:00
Rich Graham
8765a2bbdd
more debug code.
...
This commit was SVN r18101.
2008-04-08 20:38:20 +00:00
Rich Graham
08becf33b5
add more debugging.
...
This commit was SVN r18100.
2008-04-08 18:44:50 +00:00
Rich Graham
aa1b7dd406
more debug
...
This commit was SVN r18099.
2008-04-08 03:56:47 +00:00
Rich Graham
0c18bdeff7
more debug code.
...
This commit was SVN r18098.
2008-04-08 03:04:20 +00:00
Rich Graham
9d5a7238df
Add some debugging code.
...
This commit was SVN r18097.
2008-04-07 23:20:15 +00:00
Rich Graham
fa696734d5
add some debug code.
...
This commit was SVN r18096.
2008-04-07 21:03:23 +00:00
Shiqing Fan
28746bbcdb
Remove the memchecker macro in pml base request, used in req_wait.c, which is actually in the wrong place. Instead, one simple call from send_request_free and recv_request_free (already done) will do all the work, fast and clean.
...
This commit was SVN r18095.
2008-04-07 17:46:50 +00:00
Tim Mattox
3f0f09fd1f
Update the NEWS file for the release of 1.2.6, plus spelling fixes.
...
This commit was SVN r18093.
2008-04-07 16:39:58 +00:00
George Bosilca
9e0bc441a6
Make this header ISO C compliant.
...
This commit was SVN r18090.
2008-04-07 14:47:13 +00:00
Shiqing Fan
d22de11e8e
Remove the running debugger function.
...
This commit was SVN r18087.
2008-04-07 10:40:02 +00:00
Shiqing Fan
c74b488cdb
Forgot to comment this function out for the moment.
...
This commit was SVN r18086.
2008-04-07 10:33:11 +00:00
Shiqing Fan
a1e5df1cc9
Use the new memchecker function call which is based on convertor.
...
Remove one unnecessary call.
This commit was SVN r18085.
2008-04-07 07:52:04 +00:00
Shiqing Fan
a913a60c24
Add a new function for setting memory states based on structure convertor.
...
The benefits of this function are lower memory use, compactness, and better performance. Thanks to George.
Keep the old memchecker function as well in case the convertor is not available.
This commit was SVN r18084.
2008-04-07 07:47:27 +00:00
Gleb Natapov
713a27dc71
Counter of created RDMA channels should be incremented immediately after channel
...
creation (not in control message completion) otherwise more than max_eager_rdma
channels may be created.
This commit was SVN r18082.
2008-04-06 13:48:45 +00:00
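The ordering issue described in r18082 can be shown with a small single-threaded sketch. It is not the openib BTL code: chan_pool_t, try_create_channel, and on_control_complete are hypothetical names, and only the max_eager_rdma limit is taken from the commit message. Real code would also need the check and increment to be atomic or done under a lock.

```c
#include <stdbool.h>

typedef struct {
    int max_eager_rdma;  /* upper bound on channels */
    int created;         /* channels created, including those whose
                            control message is still in flight */
    int confirmed;       /* channels whose control message completed */
} chan_pool_t;

/* Count the channel at creation time. If the increment were deferred
 * to the completion callback, several creations could pass the limit
 * check while their control messages are still pending. */
bool try_create_channel(chan_pool_t *p)
{
    if (p->created >= p->max_eager_rdma) {
        return false;
    }
    p->created++;
    /* ... the control message would be sent here ... */
    return true;
}

/* Completion only records the confirmation; it does not touch the
 * creation counter. */
void on_control_complete(chan_pool_t *p)
{
    p->confirmed++;
}
```

With a limit of 3, five back-to-back creation attempts made before any completion arrives yield exactly three channels.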
Rich Graham
1b54e8b76e
fix buffer management for nb-barrier.
...
This commit was SVN r18081.
2008-04-05 21:59:04 +00:00
Ralph Castain
5e6dc24e62
Fix ompi-server so it works with unity routed module - still not working with tree routing.
...
Cleanup debug flag so it activates debugging on the data server code itself
This commit was SVN r18080.
2008-04-04 19:17:28 +00:00
Tim Prins
313edd8955
- Fix a problem reported on the users list where we would segfault in finalize after calling spawn if the user did not call MPI_Comm_disconnect
...
- Fix the app context constructor so it initializes all the fields.
This commit was SVN r18079.
2008-04-04 15:07:39 +00:00