8472 commits

Author SHA1 Message Date
George Bosilca
b51b87a4aa The correct way to compute the difference between the actual size and the
expected size, based on the comment a few lines before.

This commit was SVN r12235.
2006-10-20 19:33:55 +00:00
George Bosilca
d7268557a8 Complete the SM BTL changes. Now all displacements are ptrdiff_t and there are
no more warnings about signed/unsigned issues.

This commit was SVN r12234.
2006-10-20 19:28:12 +00:00
Mohamad Chaarawi
08a9b6458c fixed the MPI_Translate_ranks issues reported earlier, where a rank of
MPI_PROC_NULL translates to MPI_PROC_NULL, and an MPI_GROUP_EMPTY as one of
the groups doesn't cause a segmentation fault, but returns MPI_UNDEFINED for
all ranks to be translated.

This commit was SVN r12233.
2006-10-20 19:13:49 +00:00
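The rank translation behaviour described in r12233 can be illustrated with a minimal sketch. This is not the Open MPI implementation: groups are modelled as plain Python lists of process identifiers, an empty list stands in for MPI_GROUP_EMPTY, and the numeric values of the two sentinels are placeholders chosen for this example.

```python
# Placeholder sentinel values for illustration only; real MPI
# implementations define their own constants.
MPI_PROC_NULL = -2
MPI_UNDEFINED = -32766


def translate_ranks(group1, ranks, group2):
    """Translate ranks of group1 into the matching ranks of group2.

    Groups are modelled as lists of process identifiers indexed by rank.
    An empty list plays the role of MPI_GROUP_EMPTY.
    """
    # If either group is empty, every requested rank is untranslatable.
    if not group1 or not group2:
        return [MPI_UNDEFINED] * len(ranks)

    # Map each process in group2 to its rank there.
    position = {proc: rank for rank, proc in enumerate(group2)}

    result = []
    for rank in ranks:
        if rank == MPI_PROC_NULL:
            # MPI_PROC_NULL translates to itself.
            result.append(MPI_PROC_NULL)
        else:
            # Processes absent from group2 translate to MPI_UNDEFINED.
            result.append(position.get(group1[rank], MPI_UNDEFINED))
    return result
```

For example, translating ranks `[0, MPI_PROC_NULL, 2]` of the group `["a", "b", "c"]` into the group `["c", "a"]` gives rank 1, then MPI_PROC_NULL unchanged, then rank 0.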
Tim Prins
28bf4d85ab A couple of small fixes:
- It is possible to leave a byslot/bynode routine and have cur_node_item be NULL, so check for that.
- After we do an allocation where the user has provided a map (i.e. with --host), cur_node_item is pointing into the map list, not the global list. Change it to point into the global list.

This commit was SVN r12232.
2006-10-20 19:00:17 +00:00
Ralph Castain
955d11fa7b The bookmark now respects slot assignments a little better. It will not oversubscribe the first node, but will take only what is available there before moving on.
See the comment in orte/mca/rmaps/round_robin/rmaps_rr.c if you want the details... :-)

This commit was SVN r12230.
2006-10-20 18:24:14 +00:00
Ralph Castain
ec0bb9ffda Fix the bookmark system - we now have children being correctly spawned where they should!
Also, I am no longer seeing any issue with the child job spawning its own daemons - this appears to be fixed. We still don't reuse the existing daemons, however, but that will come.

This commit was SVN r12229.
2006-10-20 18:05:16 +00:00
George Bosilca
c4b0d0c026 Update the Windows README file.
This commit was SVN r12228.
2006-10-20 17:59:57 +00:00
George Bosilca
c86214f420 Fix the SM BTL issues. The problem seems to come from the fact that
the maximum number of nodes on the SM file should be signed, as we use
the -1 to unlimit it.

This commit was SVN r12227.
2006-10-20 17:25:53 +00:00
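Why a limit that uses -1 to mean "unlimited" should be stored in a signed type can be shown with a small, generic sketch. The commit does not say how the SM BTL checks the value, so the `is_unlimited` helper below is a hypothetical stand-in, and `ctypes` is used only to mimic 32-bit signed and unsigned C integers.

```python
import ctypes

UNLIMITED = -1  # sentinel meaning "no maximum", as in the commit message


def is_unlimited(stored_max):
    """Hypothetical check: a negative maximum means 'no limit'."""
    return stored_max < 0


# Stored in a signed 32-bit field, the sentinel keeps its value.
signed_max = ctypes.c_int32(UNLIMITED).value

# Stored in an unsigned 32-bit field, the sentinel wraps around to a
# large positive number, so the "negative means unlimited" check fails.
unsigned_max = ctypes.c_uint32(UNLIMITED).value
```

Here `is_unlimited(signed_max)` holds, while `is_unlimited(unsigned_max)` does not, because the unsigned field holds 4294967295 rather than -1.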
Ralph Castain
c07d4e2510 Cleaner rendition now extended to other environments. Remove MCA params for backend procs that can cause trouble. Specifically, any directives on the selection of components for RDS, RAS, RMAPS, PLS, and RMGR can be bad mojo on the backend.
This patch will cause a problem for cnos, however, as there we want to specifically tell the backends to be "null". I'm working on that issue.

This commit was SVN r12225.
2006-10-20 16:50:13 +00:00
Ralph Castain
02efd07b60 Fix the MCA param passing issue, at least for rsh at the moment. I will clean this up and move it to the other environments once I shift back to a local computer.
This commit was SVN r12224.
2006-10-20 15:27:29 +00:00
Ralph Castain
b07a6b1d7a Fix a major typo that caused remote launch to crash - had something inside the wrong brace
This commit was SVN r12221.
2006-10-20 14:30:23 +00:00
Brian Barrett
37fad860b7 Grrr... Forgot that EXTRA_DIST and man_MANS are not set to include all the
possible things contained in the conditional like other rules are (for
example, a SOURCES rule in a conditional automatically has its files
added to the dist rules, even if that conditional isn't true when
make dist occurs).  So the man files weren't in the tarball.

Put the EXTRA_DIST with the files explicitly listed outside any conditionals
so the man pages always end up in the tarball.

This commit was SVN r12220.
2006-10-20 14:15:38 +00:00
Tim Prins
7ec3287d3d Need a rule to make opal_wrapper.1
This commit was SVN r12217.
2006-10-20 12:04:59 +00:00
George Bosilca
10a79f4822 We always have to include <PROJECT>_config.h as the first include.
This commit was SVN r12216.
2006-10-20 07:01:52 +00:00
George Bosilca
06563b5dec Last set of explicit conversions. We are now close to zero warnings on
all platforms. The only exceptions (and I will not deal with them
anytime soon) are on Windows:
- the write functions which require the length to be an int when it's
  a size_t on all UNIX variants.
- all iovec manipulation functions where the iov_len is again an int
  when it's a size_t on most of the UNIXes.
As these only happen on Windows, I think we're set for now :)

This commit was SVN r12215.
2006-10-20 03:57:44 +00:00
George Bosilca
e81d38f322 Remove a function that was just a proof of concept. The same approach is
not used by the TotalView support.

This commit was SVN r12214.
2006-10-20 03:34:16 +00:00
George Bosilca
527bb7a197 Remove a double ;
This commit was SVN r12213.
2006-10-20 03:28:51 +00:00
Brian Barrett
4dad3ef3ef Follow on to r12146. For platforms that don't have a ptrdiff_t definition,
provide one for the internals of Open MPI.  For mpi.h, typedef MPI_Aint
either to ptrdiff_t or whatever we used as ptrdiff_t if that type doesn't
actually exist.

This commit was SVN r12212.

The following SVN revision numbers were found above:
  r12146 --> open-mpi/ompi@8852c00c36
2006-10-20 03:24:59 +00:00
George Bosilca
f43d4fa4f2 Last set of datatype updates. Mostly function prototypes updates and
explicit casting.

This commit was SVN r12211.
2006-10-20 02:31:50 +00:00
George Bosilca
dc7bcabb22 type.
This commit was SVN r12210.
2006-10-20 02:30:33 +00:00
George Bosilca
b0a03fae4d Let the wrapper compiler complain when it does not find one of the
configuration files.

This commit was SVN r12209.
2006-10-20 02:29:48 +00:00
George Bosilca
66eb007b22 New version of the Windows compatibility file.
This commit was SVN r12208.
2006-10-20 02:28:41 +00:00
George Bosilca
2aa3e51223 Nothing relevant. Only a set of casts to get a clean compile on
Windows. The cl.exe compiler is pretty good at complaining about
any kind of non-explicit cast.

This commit was SVN r12207.
2006-10-20 02:25:50 +00:00
George Bosilca
7982a23bde ORTE_DECLSPEC should be ...
This commit was SVN r12206.
2006-10-20 02:23:54 +00:00
George Bosilca
5a939e21b2 Populate the file with ORTE_DECLSPEC declarations.
This commit was SVN r12205.
2006-10-20 02:19:30 +00:00
Tim Prins
45a4f2c7ed Fix a minor problem in variable naming in these configure macros.
Thanks to Martin Audet for reporting this on the users list.

This commit was SVN r12203.
2006-10-19 23:35:14 +00:00
Tim Prins
ade94b523b Fixed a number of issues related to resource allocation:
- Simplified the logic of the ras modules by moving the attribute handling into the base allocation function. This allows us to decide how to allocate based on the situation, and solves some of the allocation problems we were having with comm_spawn.
- moved the proxy component into the base. This was done because we always want to call the proxy functions if we are not on a HNP regardless of the attributes passed. 
- Got rid of the hostfile component. What little logic was in it was moved into the base to deal with other circumstances. The hostfile information is currently being propagated into the registry by the RDS, so we just use what is already in the registry.
- renamed some slurm functions so that they have the proper prefix. Not strictly necessary as they were static, but it makes debugging much easier.
- fixed a buglet in the round_robin rmaps where we would return an error when really no error occurred.

I tried to make proper corrections to all the ras modules, but I cannot test all of them.

This commit was SVN r12202.
2006-10-19 23:33:51 +00:00
Ralph Castain
ab196c3121 Okay, this fixes the problem of MCA params spreading too far. Sorry for the multiple corrections.
This commit was SVN r12201.
2006-10-19 22:51:02 +00:00
George Bosilca
caefd6d0ee Do not leak memory. Allocate the intermediary buffer only when we really need it
(not on leaves) and release it in the same way.

This commit was SVN r12200.
2006-10-19 22:20:33 +00:00
George Bosilca
26b33ec2d7 If there is just one node, we don't need a decision function, just do the copy
and return.

This commit was SVN r12199.
2006-10-19 22:19:36 +00:00
George Bosilca
c788f0dd51 Apparently, the MCA params are being loaded into the environment params
of the individual app_contexts as well. Clear them of "bad" ones.

This commit was SVN r12198.
2006-10-19 21:19:08 +00:00
Ralph Castain
d0d3d7fd41 Apparently, the MCA params are being loaded into the environment params of the individual app_contexts as well. Clear them of "bad" ones.
This commit was SVN r12197.
2006-10-19 20:49:35 +00:00
Ralph Castain
382f954fff Fix a bug in the way we saved and passed environments to child processes on remote nodes. The problem was that MCA directives for component selection were being passed back to the children. However, now that we only allow certain components to operate on HNPs, this caused the children to bomb out of orte_init.
This commit was SVN r12196.
2006-10-19 20:35:55 +00:00
Brian Barrett
204f5b8f52 - Clean up wrapper compiler man pages during maintainer-clean, since
    they might require special tools (not sure if sed with multiple -e
    arguments is totally portable)
  - Ignore the opalcc.1 man page.  Couldn't do this in the previous
    man page commit (r12192) because I was removing opalcc.1 in that
    commit.

This commit was SVN r12194.

The following SVN revision numbers were found above:
  r12192 --> open-mpi/ompi@581a4b0a4e
2006-10-19 20:14:40 +00:00
Ralph Castain
263f4379e8 Clean up an error in the mapper that caused "-hosts" to bomb.
Update the mapper so it correctly points to the next node to be used if we are mapping by slots. As it was, if we had an app_context that used only one slot on a node, the next app_context would start on the next node - leaving a blank slot in-between.

This commit was SVN r12193.
2006-10-19 18:57:29 +00:00
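The by-slot behaviour described in r12193 (and the bookmark mentioned in r12229 and r12230) can be sketched as follows. This is a simplified illustration, not the code in rmaps_rr.c: the `map_by_slot` function, the `(name, slots)` node tuples, and the choice to raise an error instead of oversubscribing are all assumptions made for this example.

```python
def map_by_slot(nodes, app_sizes):
    """Place each app_context's processes on nodes, slot by slot.

    nodes:     list of (name, slots) pairs, in allocation order.
    app_sizes: number of processes requested by each app_context.
    Returns one list of node names per app_context.
    """
    node_idx = 0  # bookmark: node to continue on
    used = 0      # bookmark: slots already taken on that node
    placements = []

    for nprocs in app_sizes:
        placed = []
        for _ in range(nprocs):
            # Move on only once the current node has no free slots left.
            while node_idx < len(nodes) and used >= nodes[node_idx][1]:
                node_idx += 1
                used = 0
            if node_idx == len(nodes):
                # Oversubscription is deliberately not modelled here.
                raise RuntimeError("not enough slots for all processes")
            placed.append(nodes[node_idx][0])
            used += 1
        placements.append(placed)
    return placements
```

With two nodes of two slots each and app_contexts of one and two processes, the second app_context starts on the free slot of the first node rather than skipping to the next node, so no slot is left blank in between.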
Brian Barrett
581a4b0a4e A few cleanups to the wrapper compiler build system / man pages:
  - Only install opal{cc,c++} and orte{cc,c++} if configured with
    --with-devel-headers.  Right now, they are always installed, but
    there are no header files installed for either project, so there's
    really not much way for a user to actually compile an OPAL / ORTE
    application.

  - Drop support for opalCC and orteCC.  It's a pain to set up all the
    symlinks (indeed, they are currently done wrong for opalCC) and
    there's no history like there is for mpiCC.

  - Change what is currently opalcc.1 to opal_wrapper.1 and add some
    macros that get sed'ed so that the man pages appear to be
    customized for the given command.

  - Install the wrapper data files even if we compiled with
    --disable-binaries.  This is for the use case of doing multi-lib
    builds, where one word size will only have the library built, but
    we need both sets of wrapper data files to piece together to
    activate the multi-lib support in the wrapper compilers.

This commit was SVN r12192.
2006-10-19 18:34:17 +00:00
George Bosilca
3eb2f90ceb For the recursive doubling, correctly compute the closest power of 2 number of
nodes.

This commit was SVN r12191.
2006-10-19 17:14:57 +00:00
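The power-of-two computation mentioned in r12191 can be sketched as below. The commit does not define "closest"; this sketch assumes the largest power of two that does not exceed the number of processes, which is the usual starting point for recursive doubling. The function name is hypothetical.

```python
def largest_power_of_two_at_most(nprocs):
    """Return the largest power of two that is <= nprocs."""
    if nprocs < 1:
        raise ValueError("nprocs must be at least 1")
    # bit_length() is the position of the highest set bit, so shifting 1
    # by one less than that yields the highest power of two in nprocs.
    return 1 << (nprocs.bit_length() - 1)


def extra_processes(nprocs):
    """Processes left over once the power-of-two subset is chosen."""
    return nprocs - largest_power_of_two_at_most(nprocs)
```

For 9 processes this picks a subset of 8 and leaves 1 extra process to be handled outside the doubling steps.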
George Bosilca
041fcb8d18 Update the barrier decision function.
This commit was SVN r12190.
2006-10-19 17:14:01 +00:00
Tim Prins
81d400ddfd break when they are equal, not not equal
This commit was SVN r12182.
2006-10-18 21:47:01 +00:00
George Bosilca
18d119bc06 No more warnings.
This commit was SVN r12181.
2006-10-18 21:10:11 +00:00
Tim Prins
ab964d096a Need a terminating NULL...
This commit was SVN r12180.
2006-10-18 20:52:31 +00:00
Ralph Castain
d0eb7d7216 Complete the attribute management functions.
Modify the mapper to better bookmark its stopping place each time, and to pick up the next time from there. This needs to be validated on a multi-node system.

Fix a major memory corruption problem in the registry put/get functions, which were freeing the same memory multiple times. Not sure how valgrind missed this one, though it only occurred in specific circumstances (such as comm_spawn).

This commit was SVN r12179.
2006-10-18 20:02:16 +00:00
Galen Shipman
2036bf5c3c make smart and dumb compilers happy
This commit was SVN r12178.
2006-10-18 19:33:39 +00:00
George Bosilca
c9da782804 Keep only one function to get the size of a datatype.
This commit was SVN r12170.
2006-10-18 17:33:01 +00:00
George Bosilca
3db5c0487d typos.
This commit was SVN r12168.
2006-10-18 17:12:25 +00:00
George Bosilca
a1c9a374eb Remove all the warnings from the data-type engine testing.
This commit was SVN r12167.
2006-10-18 17:00:43 +00:00
George Bosilca
21ade43b96 Remove an unreachable statement.
This commit was SVN r12166.
2006-10-18 16:43:55 +00:00
Rainer Keller
47b24a0603 - Now the branch is done, linearize access regarding
   request handling.  Buys a little bit on IMB, no
   functional change, otherwise.

This commit was SVN r12165.
2006-10-18 16:11:50 +00:00
Ralph Castain
f4a458532b This doesn't totally resolve the comm_spawn problem, but it helps a little. I'll continue working on it and hope to resolve it completely shortly. The issue primarily centers on where to start mapping the child job's processes, and how to deal with oversubscription that might result. At the moment, I am trying to resolve the first issue first (hey, that even sounds right!).
This change does a couple of things:

1. Since the USE_PARENT_ALLOC attribute is a directive regarding allocation of resources to a job, it more properly should be an attribute of the RAS. Change the name to reflect that and move the attribute define to the ras_types.h file.

2. Add the attributes list to the RMAPS map_job interface. This provides us with the desired flexibility to dynamically specify directives for mapping. The system will - in the absence of any attribute-based directive - default to the values provided in the MCA parameters (either from environment or command-line interface).

This commit was SVN r12164.
2006-10-18 14:01:44 +00:00
Gleb Natapov
252a9cea34 Fix bug in vma rcache.
This commit was SVN r12163.
2006-10-18 10:55:01 +00:00