openmpi/ISSUES
Brian Barrett aa70a35fea * Sync trunk to r4977 of the tim branch
This commit was SVN r4978.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r4977
2005-03-22 00:31:17 +00:00

Undecided timing:
-----------------
- if an MPI process fails (e.g., it segfaults), orterun hangs.
This happens with the rsh pls.
--> Looks like the problem is with what happens when you set the
state of the process in the soh to ORTE_PROC_STATE_ABORTED.
--> Ralph is looking at this
- if the daemon is not found or fails to start, orterun hangs. The
user gets no indication that something went wrong.
--> Brian thinks he fixed this, but since his fix sets the state to
ORTE_PROC_STATE_ABORTED, this can't be confirmed until the
above issue is fixed. It does at least tell you what went
wrong.
- $prefix/etc/hosts vs. $prefix/etc/openmpi-default-hostfile
--> Brian temporarily added a symlink in $prefix/etc/
(openmpi-default-hostfile -> hosts), created only if there isn't
already a hosts file, so that he doesn't have to create one every
time he does "rm -rf $prefix && make install". He will file a
bug so that this can be fixed properly (and will fix it in the trunk).

Pre-milestone:
--------------
- singleton MPI doesn't work
- Ralph: Populate orte_finalize()

Post-milestone:
---------------
- ras_base_alloc: doesn't allow for oversubscribing like this:
    eddie: cpu=2
    vogon: cpu=2 max-slots=4
    mpirun -np 6 uptime
It fails because it tries to divide the remaining unallocated procs
evenly across all nodes (i.e., 1 each on eddie/vogon) rather than
seeing that vogon can take the remaining 2.
- Jeff: TM needs to be re-written to use daemons (can't hog TM
connection forever)
- Jeff: make the mapper be able to handle app->map_data
- Jeff: add function callback in cmd_line_t stuff
- Jeff: does cmd_line_t need to *get* MCA params if a command line
param is not taken but an MCA param is available?
- consider empty string problem...
- ?: Friendlier error messages (e.g., if no nodes -- need something
meaningful to tell the user)
- Ralph: compare and set function in GPR
- Jeff: collapse MCA params from 3 names to 1 name
- ?: Apply LANL copyright to trunk (post all merging activity)
- Probably during/after OMPI/ORTE split:
  - re-merge [orte|ompi]_pointer_array and [orte|ompi]_value_array
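
A minimal sketch of the allocation behavior that the ras_base_alloc item
above asks for, written in Python for brevity rather than as ORTE code.
The Node class, the allocate() function, and the rule that a node without
max-slots is capped at its cpu count are all assumptions made for
illustration; the real ras_base_alloc logic is not shown in this file.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for one hostfile entry."""
    name: str
    slots: int          # the "cpu=" count from the hostfile
    max_slots: int = 0  # 0 means "not given"

    @property
    def limit(self) -> int:
        # Assumption: without max-slots, a node may not be oversubscribed.
        return self.max_slots if self.max_slots > 0 else self.slots


def allocate(nodes, nprocs):
    """Return {node name: proc count}; raise ValueError if nprocs won't fit."""
    alloc = {n.name: 0 for n in nodes}
    remaining = nprocs

    # Pass 1: fill the regular slots on every node.
    for n in nodes:
        take = min(n.slots, remaining)
        alloc[n.name] += take
        remaining -= take

    # Pass 2: place leftovers only on nodes that still have headroom,
    # instead of dividing them evenly across all nodes.
    for n in nodes:
        if remaining == 0:
            break
        headroom = max(0, n.limit - alloc[n.name])
        take = min(headroom, remaining)
        alloc[n.name] += take
        remaining -= take

    if remaining > 0:
        raise ValueError(f"cannot place {remaining} of {nprocs} procs")
    return alloc


if __name__ == "__main__":
    nodes = [Node("eddie", 2), Node("vogon", 2, max_slots=4)]
    print(allocate(nodes, 6))
```

Under these assumptions, the eddie/vogon case with -np 6 places 2 procs on
eddie and 4 on vogon, and a request for 7 procs is rejected because only 6
can be placed.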