Commit graph

398 commits

Author SHA1 Message Date
Ralph Castain
cb48fd52d4 Implement the MPI_Info part of MPI-3 Ticket 313. Add an MPI_info object MPI_INFO_GET_ENV that contains a number of run-time related pieces of info. This includes all the required ones in the ticket, plus a few that specifically address recent user questions:
"num_app_ctx" - the number of app_contexts in the job
"first_rank" - the MPI rank of the first process in each app_context
"np" - the number of procs in each app_context

Still need clarification on the MPI_Init portion of the ticket. Specifically, does the ticket call for returning an error if someone calls MPI_Init more than once in a program? We set a flag to tell us that we have been initialized, but currently never check it.

This commit was SVN r27005.
2012-08-12 01:28:23 +00:00
George Bosilca
f7528bb404 Remove unused variables.
This commit was SVN r26966.
2012-08-08 12:43:13 +00:00
George Bosilca
2303cd0bdb Remove initialized but unused variables.
This commit was SVN r26959.
2012-08-07 12:05:25 +00:00
Jeff Squyres
0b7b3feba9 Minor fix for the command line parser: we didn't previously
distinguish between unknown ''options'' (i.e., command line options
that are registered and have some meaning) and unknown ''tokens''
(i.e., strings that do not begin with "-").  Hence, if you did:

   mpirun --fo my_mpi_program

(when perhaps you meant to type "--foo"), mpirun would complain that no
such executable "--fo" existed.  That is ''correct,'' but perhaps not
completely useful.  It is more accurate for mpirun to report that
there is no such "--fo" option.

This change to cmd_line.c makes it so that we will ''always'' report
errors regarding tokens that begin with "-".

This commit was SVN r26953.
2012-08-06 17:13:08 +00:00
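The option/token distinction described in commit 0b7b3feba9 can be sketched roughly as follows. This is only an illustration, not the code in cmd_line.c: the `arg_kind_t` enum, the `classify_arg` function, and the fixed `known_options` table are all hypothetical names made up for this note, and the sketch assumes an argument is a "token" exactly when it does not begin with "-".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of one argv entry (not an Open MPI type). */
typedef enum {
    ARG_KNOWN_OPTION,   /* begins with "-" and is registered */
    ARG_UNKNOWN_OPTION, /* begins with "-" but is not registered */
    ARG_TOKEN           /* does not begin with "-", e.g. an executable name */
} arg_kind_t;

/* Illustrative, fixed option table; a real parser registers options at run time. */
static const char *const known_options[] = { "--foo", "--help", "-np" };

arg_kind_t classify_arg(const char *arg)
{
    size_t i;

    assert(arg != NULL);

    /* Anything that does not start with "-" is a plain token. */
    if (arg[0] != '-') {
        return ARG_TOKEN;
    }

    /* It looks like an option: report it as known only if it is registered. */
    for (i = 0; i < sizeof(known_options) / sizeof(known_options[0]); ++i) {
        if (strcmp(arg, known_options[i]) == 0) {
            return ARG_KNOWN_OPTION;
        }
    }
    return ARG_UNKNOWN_OPTION;
}
```

With this split, a caller can report "--fo" as an unknown option instead of trying to launch it as an executable.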
Ralph Castain
bd8b4f7f1e Sorry for mid-day commit, but I had promised on the call to do this upon my return.
Roll in the ORTE state machine. Remove last traces of opal_sos. Remove UTK epoch code.

Please see the various emails about the state machine change for details. I'll send something out later with more info on the new arch.

This commit was SVN r26242.
2012-04-06 14:23:13 +00:00
Jeff Squyres
97b3603036 A bunch of fixes and improvements to Open MPI's various command line tools.
 * fixed some bugs where "unknown" tokens were allowed on the command
   line (which should really only be used for orterun).
 * if an unknown token is encountered, print a short error to stderr
   and quit with a nonzero exit status
 * if we don't find the right number of parameters to an option, print
   a short error to stderr and quit with a nonzero exit status
 * when --help is given, print the help message to stdout (not stderr)
   and quit with a zero exit status
 * added --showme:help option to the wrapper compilers
 * updated docs in opal/util/cmd_line.h
 * other small/miscellaneous CLI parsing bugs in various tools

I won't bore you with what we did before.  :-)  Here are some examples
of what the new behavior looks like:

{{{
% ompi_info --bogus
ompi_info: Error: unknown option "--bogus"
Type 'ompi_info --help' for usage.
% ompi_info --param bogus
ompi_info: Error: option "--param" did not have enough parameters (2)
Type 'ompi_info --help' for usage.
%
}}}

This commit was SVN r26072.
2012-02-29 17:52:38 +00:00
Ralph Castain
8446673dc4 Update the cmd line parser to return an error if someone forgets to include a numeric parameter to a cmd line option that requires one. Can't do anything about options that require strings, but we can at least bark when someone forgets the "-np N" argument.
This commit was SVN r26068.
2012-02-28 20:33:53 +00:00
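The numeric-parameter check described in commit 8446673dc4 might look something like the sketch below. It is an assumption-laden illustration, not the parser's actual code: `parse_numeric_param` is a hypothetical helper, and it treats "numeric" as "a complete base-10 integer accepted by `strtol`".

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the argument that follows an option such as
 * "-np". Returns 0 and stores the value on success, or -1 if the argument
 * is missing or is not a complete base-10 integer.
 */
int parse_numeric_param(const char *arg, long *value)
{
    char *end = NULL;
    long parsed;

    assert(value != NULL);

    /* The option was the last thing on the command line. */
    if (arg == NULL || arg[0] == '\0') {
        return -1;
    }

    errno = 0;
    parsed = strtol(arg, &end, 10);

    /* Reject overflow, no digits at all, or trailing junk such as "4x". */
    if (errno != 0 || end == arg || *end != '\0') {
        return -1;
    }

    *value = parsed;
    return 0;
}
```

So "mpirun -np ./a.out" would fail the check because "./a.out" contains no digits, which matches the commit's point that string-valued options cannot be validated the same way.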
Jeff Squyres
5f9ac93455 Fix suggested by Paul Hargrove to eliminate a dangerous trailing context
This commit was SVN r25983.
2012-02-21 13:29:58 +00:00
Jeff Squyres
6bb98f072f Fix typo; could hypothetically fix the problem reported by Paul
Hargrove: http://www.open-mpi.org/community/lists/devel/2012/02/10483.php 

This commit was SVN r25982.
2012-02-21 13:19:09 +00:00
Ralph Castain
47c64ec837 Roll in Java bindings per telecon discussion. Man pages still under revision
This commit was SVN r25973.
2012-02-20 22:12:43 +00:00
Jeff Squyres
435aea9ccd A better solution -- just look for __linux__.
This commit was SVN r25841.
2012-01-31 20:27:33 +00:00
Jeff Squyres
538cdce8fb Add checks for __linux and __linux__, per Paul Hargrove's analysis:
http://www.open-mpi.org/community/lists/devel/2012/01/10283.php.  Also
remove some unused #defines.

This commit was SVN r25836.
2012-01-31 16:45:50 +00:00
Jeff Squyres
6fbbfd0f7a Gah! r25545 accidentally included ''waaaay'' more stuff than it was
supposed to.  I.e., half-baked/not complete stuff.

This commit backs out all of r25545.  Sorry folks!

This commit was SVN r25546.

The following SVN revision numbers were found above:
  r25545 --> open-mpi/ompi@7f9ae11faf
2011-11-29 23:24:52 +00:00
Jeff Squyres
7f9ae11faf Per http://www.open-mpi.org/community/lists/users/2011/11/17862.php,
to make MPI_IN_PLACE (and other sentinel Fortran constants) work on OS
X, we need to use the following compiler (linker) flag:

    -Wl,-commons,use_dylibs 

So if we're compiling on OS X, test to see if that flag works with the
compiler.  If so, add it to the wrapper FFLAGS and FCFLAGS (note that
per a future update, we'll only have one Fortran compiler anyway).

Fixes trac:1982.  

This commit was SVN r25545.

The following Trac tickets were found above:
  Ticket 1982 --> https://svn.open-mpi.org/trac/ompi/ticket/1982
2011-11-29 23:05:54 +00:00
Jeff Squyres
21dc0b44e1 Fix minor typo in comment
This commit was SVN r25542.
2011-11-29 20:39:53 +00:00
Christopher Yeoh
7e7701e7fc Removes misleading debug warning from opal_free when a NULL
pointer is passed to it.
Fixes trac:2884

This commit was SVN r25430.

The following Trac tickets were found above:
  Ticket 2884 --> https://svn.open-mpi.org/trac/ompi/ticket/2884
2011-11-03 23:57:26 +00:00
Rainer Keller
4e6a6fc146 - Check whether the compiler supports __builtin_clz (count leading
   zeroes); if so, use it for bit-operations like opal_cube_dim and opal_hibit.
   Implement two versions of power-of-two.
   In case of opal_next_poweroftwo, this reduces the average execution
   time from 83 cycles to 4 cycles (Intel Nehalem, icc, -O2, inlining,
   measured rdtsc, with loop over 2^27 values).
   Numbers for other functions are similar (but of course heavily depend
   on the usage, e.g. opal_hibit() with a start of 4 does not save
   much).  The bsr instruction on AMD Opteron is also not as fast.

 - Replace various places where the next power-of-two is computed.
   
   Tested on Intel Nehalem Cluster with openib, compilers GNU-4.6.1 and
   Intel-12.0.4 using mpi_testsuite -t "Collective" with 128 processes.

This commit was SVN r25270.
2011-10-11 22:49:01 +00:00
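The idea behind commit 4e6a6fc146 can be illustrated with a small sketch that computes a power of two both with a shift loop and with `__builtin_clz`. The function names `next_pow2_loop` and `next_pow2_clz` are hypothetical, and the semantics chosen here (smallest power of two greater than or equal to the input) are an assumption made for this note; they are not claimed to match opal_next_poweroftwo.

```c
#include <assert.h>
#include <limits.h>

/* Portable fallback: double p until it reaches or passes value. */
unsigned next_pow2_loop(unsigned value)
{
    unsigned p = 1;

    /* Larger inputs have no representable answer in an unsigned. */
    assert(value <= (UINT_MAX >> 1) + 1u);
    while (p < value) {
        p <<= 1;
    }
    return p;
}

/* Same result, but derived from the position of the highest set bit. */
unsigned next_pow2_clz(unsigned value)
{
    assert(value <= (UINT_MAX >> 1) + 1u);

    /* __builtin_clz(0) is undefined, so handle 0 and 1 up front. */
    if (value <= 1) {
        return 1;
    }
#if defined(__GNUC__) || defined(__clang__)
    /* Number of bits needed to represent value - 1 gives the exponent. */
    return 1u << ((unsigned)(sizeof(unsigned) * CHAR_BIT)
                  - (unsigned)__builtin_clz(value - 1));
#else
    /* Compiler without the builtin: fall back to the loop. */
    return next_pow2_loop(value);
#endif
}
```

The builtin version replaces a data-dependent loop with a single bit-scan plus a shift, which is where the cycle savings reported in the commit would come from.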
Swen Boehm
08b4322a1a patched the lex files to not issue the following compiler warning:
'yyunput' defined but not used

This commit was SVN r25246.
2011-10-10 18:13:04 +00:00
George Bosilca
649af6c925 Enumerated mixed with another type (int) is tolerated but
easily fixable.

This commit was SVN r25241.
2011-10-09 03:54:52 +00:00
Ralph Castain
da9bbf68ec Fix the output of error strings. Every convertor is returning OPAL_SUCCESS, so you have to check each convertor to find which one this error belongs to, and then run ONLY that convertor.
This commit was SVN r25009.
2011-08-08 04:10:40 +00:00
Ralph Castain
2af867d26f Don't segfault if show_help is called prior to calling opal_init_util
This commit was SVN r24825.
2011-06-27 16:35:19 +00:00
Ralph Castain
4c06c9c07c Simplify the code a little bit by recognizing that end=start isn't an error, but just indicates a partial address typical of CIDR notation.
This commit was SVN r24757.
2011-06-07 11:33:22 +00:00
Ralph Castain
666fdeab8f Okay to return an error on end=start of string conversion so long as the strlen > 0, so restore that error check.
This commit was SVN r24756.
2011-06-07 03:20:01 +00:00
Ralph Castain
f3cae3d6f3 Cleanup the handling of if_include and if_exclude arguments based on CIDR notation.
Fix a bug in the new code that prevented the system from correctly matching addresses.

Remove comments in the show-help text indicating that we would continue in the face of incorrect specifications - leave that to the calling layer to decide.

Modify the new opal_ifmatches so it returns error codes letting the caller better understand the result.

Modify the oob to ensure we abort if we don't find interfaces matching specified constraints, and that we do so without multiple error messages.

NOTE: we have a conflict in our standards. We have been using comma-delimited lists of interfaces for all our params. However, one param - opal_net_private_ipv4 - now uses semicolons instead of comma separators. No idea why, but it is confusing.

This commit was SVN r24755.
2011-06-07 02:09:11 +00:00
George Bosilca
910a289e97 Remove the explicit "attempt to continue".
This commit was SVN r24754.
2011-06-07 01:27:08 +00:00
George Bosilca
7ebd094ecf Cleanup the IPv4 address parsing, and correct the error message.
This commit was SVN r24750.
2011-06-06 03:08:02 +00:00
Ralph Castain
1491d52bd7 Extend the parsing capability of the oob tcp module's if_include and if_exclude options to support subnet+mask notation, and to handle virtual IP addresses (it was previously having problems distinguishing between "eth1" and "eth1.3").
This commit was SVN r24747.
2011-06-05 19:16:42 +00:00
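As a rough illustration of the subnet+mask matching that commit 1491d52bd7 describes, the sketch below tests an IPv4 address against a "network/prefix-length" string. It is not the oob tcp code: `ipv4_matches_cidr` is a hypothetical function, it relies on the POSIX `inet_pton` call, and it assumes the network part is a full dotted quad (the partial addresses mentioned in the later commits are not handled).

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: returns 1 if addr lies inside cidr ("a.b.c.d/bits"),
 * 0 if it does not, and -1 if either string is malformed.
 */
int ipv4_matches_cidr(const char *addr, const char *cidr)
{
    char net[INET_ADDRSTRLEN];
    struct in_addr a, n;
    const char *slash;
    char *end = NULL;
    long bits;
    size_t len;
    uint32_t mask;

    assert(addr != NULL && cidr != NULL);

    /* Split "network/bits" at the slash. */
    slash = strchr(cidr, '/');
    if (slash == NULL) {
        return -1;
    }
    len = (size_t)(slash - cidr);
    if (len == 0 || len >= sizeof(net)) {
        return -1;
    }
    memcpy(net, cidr, len);
    net[len] = '\0';

    /* The prefix length must be a whole number between 0 and 32. */
    bits = strtol(slash + 1, &end, 10);
    if (end == slash + 1 || *end != '\0' || bits < 0 || bits > 32) {
        return -1;
    }

    if (inet_pton(AF_INET, addr, &a) != 1 || inet_pton(AF_INET, net, &n) != 1) {
        return -1;
    }

    /* Build the netmask in host byte order; /0 matches every address. */
    mask = (bits == 0) ? 0 : (uint32_t)(UINT32_MAX << (32 - bits));
    return ((ntohl(a.s_addr) & mask) == (ntohl(n.s_addr) & mask)) ? 1 : 0;
}
```

Returning distinct codes for "no match" and "malformed specification" mirrors the intent stated in commit f3cae3d6f3 of letting the caller decide how to react.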
Ralph Castain
486041f89d Get rid of the annoying error messages when setrlimit fails, which seems to be a constant problem on the Mac. Don't use the changed values for max limits if the setrlimit call failed.
This commit was SVN r24703.
2011-05-17 03:27:43 +00:00
Ralph Castain
a3e43594a4 Extend node stats to include additional memory info. Change "darwin" pstat module to "test" as we don't really know how to get all the stat info for darwin.
Add a new OPAL_ERROR_LOG macro similar to the ORTE_ERROR_LOG one.

This commit was SVN r24692.
2011-05-08 14:45:16 +00:00
George Bosilca
34abbce82c More accurate and trustworthy descriptions of the netmask exist.
Interested readers can quench their curiosity either with one
of the Richard Stevens books (ISBN 9780201633467) or the
Wikipedia page (http://en.wikipedia.org/wiki/Subnetwork).

This commit was SVN r24680.
2011-05-03 21:59:51 +00:00
Ralph Castain
257473ebca Remove an extra "break" - thanks to Rainer for pointing it out.
This commit was SVN r24667.
2011-05-02 12:20:37 +00:00
Ralph Castain
7b29a6153e Cover all the netmask values
This commit was SVN r24665.
2011-04-29 17:56:15 +00:00
Shiqing Fan
4490fdbd34 Add the initial support for MinGW and MSYS.
Correctly check the dependencies of MSYS env.
Set up configure include and lib path for building the package.
update a few more CMake scripts.

This commit was SVN r24663.
2011-04-29 14:42:07 +00:00
Jeff Squyres
16d8e9216b Ran across this comment about i18n support, so I figured I'd update
it.  :-)

This commit was SVN r24631.
2011-04-22 12:14:20 +00:00
George Bosilca
eb8383802e ret might have been used uninitialized. Not anymore.
This commit was SVN r24452.
2011-02-24 03:02:48 +00:00
Shiqing Fan
baad4e1844 fix a non if-controlled brace.
This commit was SVN r24428.
2011-02-22 11:45:43 +00:00
Ralph Castain
e22262602e Extend the opal output code to support systems that cannot allow stdout/err to be output to console or files. This occurs in some embedded environments where file systems are in flash and consoles are redirected to NULL.
Add three new envars (not MCA params!) that control this behavior (see output.h for explanation).

This commit was SVN r24422.
2011-02-21 21:42:59 +00:00
Ralph Castain
bf1cff3711 Plug a couple of additional memory leaks - try to highlight a little better that strings returned from reg_string_name must be freed by caller
This commit was SVN r24383.
2011-02-14 20:58:22 +00:00
Ralph Castain
b5de068533 Clean up an error in r24371 - can't use a const parameter as target in asprintf as it changes the value of the address.
Add some new proc/job states

Rename a constant to reflect coming change - remove the arbitrary difference between restarting a proc locally and relocating it to another node in terms of the number of restarts allowed.

Add pretty-print of signals for "proc aborted due to signal" reports.

This commit was SVN r24378.

The following SVN revision numbers were found above:
  r24371 --> open-mpi/ompi@93d28a5792
2011-02-14 19:29:09 +00:00
Abhishek Kulkarni
93d28a5792 Change opal_err2str_fn_t to return the error string as an argument.
This means that the converters (opal_err2str, orte_err2str) can now
return NULL as a "silent error". The return value of opal_err2str_fn_t
is the status of the operation (OPAL_SUCCESS or OPAL_ERROR).

This fixes the "Unknown error" message issues on the trunk.

This commit was SVN r24371.
2011-02-13 16:09:17 +00:00
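The calling convention introduced in commit 93d28a5792 (status as the return value, message as an output argument, NULL meaning a "silent" error) can be sketched as follows. All names here are hypothetical stand-ins, not the OPAL/ORTE definitions: `err2str_fn_t`, `SKETCH_SUCCESS`, `SKETCH_ERROR`, the two converters, and the error numbers are invented for this note. The sketch also assumes a converter returns an error status for codes it does not own, which is one way to find the right converter.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SUCCESS 0
#define SKETCH_ERROR  (-1)

/* Status is returned; the message (possibly NULL) comes back via errmsg. */
typedef int (*err2str_fn_t)(int errnum, const char **errmsg);

/* Hypothetical lower-layer converter: owns codes -1 and -2. */
static int low_layer_err2str(int errnum, const char **errmsg)
{
    switch (errnum) {
    case -1: *errmsg = "Error";           return SKETCH_SUCCESS;
    case -2: *errmsg = "Out of resource"; return SKETCH_SUCCESS;
    default: *errmsg = NULL;              return SKETCH_ERROR;
    }
}

/* Hypothetical upper-layer converter: owns codes -100 and -101. */
static int high_layer_err2str(int errnum, const char **errmsg)
{
    switch (errnum) {
    case -100: *errmsg = NULL;              return SKETCH_SUCCESS; /* silent */
    case -101: *errmsg = "Not initialized"; return SKETCH_SUCCESS;
    default:   *errmsg = NULL;              return SKETCH_ERROR;
    }
}

static const err2str_fn_t converters[] = { low_layer_err2str, high_layer_err2str };

/* Ask each converter in turn; the first one that owns the code decides. */
const char *sketch_strerror(int errnum)
{
    const char *msg = NULL;
    size_t i;

    for (i = 0; i < sizeof(converters) / sizeof(converters[0]); ++i) {
        if (converters[i](errnum, &msg) == SKETCH_SUCCESS) {
            return msg; /* may be NULL, meaning "print nothing" */
        }
    }
    return "Unknown error";
}
```

Separating the status from the string is what lets a converter say "I know this code, and it should print nothing" without being mistaken for an unknown error.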
Nysal Jan
92e06b0a1f Missed this change suggested by Terry
This commit was SVN r24364.
2011-02-08 04:06:52 +00:00
Nysal Jan
a31025bb48 Fix pty setup code on AIX
This commit was SVN r24363.
2011-02-08 02:54:47 +00:00
Abhishek Kulkarni
d711c5a4b1 SOS fix for the Studio compilers (Thanks to Terry for spotting this).
This commit was SVN r24355.
2011-02-03 22:36:28 +00:00
Abhishek Kulkarni
3243b16bb3 Decode SOS error code before checking it with the native error code.
This commit was SVN r24281.
2011-01-20 23:21:38 +00:00
Ralph Castain
ac1853b5d8 Took me a couple of days, but finally tracked this one down. Some compilers/glibc's don't like composite test statements in a return and just randomly pick one of the two options.
So....don't do that!!!

This commit was SVN r24212.
2011-01-10 16:29:42 +00:00
Jeff Squyres
a525e70f46 Convert "opal_show_help" to be a global variable pointer.
It is statically initialized to the real back-end OPAL show_help
function.  During orte_show_help_init(), the variable is re-assigned
with the value of the back-end ORTE show_help function (the one that
does error message aggregation).  

Therefore, anything that calls opal_show_help() after a certain point
in orte_init() will have their show_help messages be aggregated.
w00t!  Even code down in OPAL -- that has no knowledge of ORTE -- will
have their messages aggregated.  '''Double w00t!'''

During orte_show_help_finalize(), we restore the original pointer
value so that if something calls opal_show_help() after
orte_finalize(), it'll still work properly (but it won't be
aggregated).  

This commit was SVN r24185.
2010-12-16 23:00:25 +00:00
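The function-pointer swap described in commit a525e70f46 can be sketched like this. It is a minimal illustration, not the OPAL/ORTE code: `sketch_show_help`, `upper_layer_init`, `upper_layer_finalize`, and the counting "aggregator" are hypothetical names, and the aggregating back end here merely counts messages instead of forwarding them anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef int (*show_help_fn_t)(const char *topic);

static int aggregated_count = 0;

/* Lower-layer back end: print the message directly. */
static int basic_show_help(const char *topic)
{
    assert(topic != NULL);
    fprintf(stderr, "help: %s\n", topic);
    return 0;
}

/* Upper-layer back end: stand-in for aggregation, it only counts messages. */
static int aggregating_show_help(const char *topic)
{
    assert(topic != NULL);
    ++aggregated_count;
    return 0;
}

/* Global pointer, statically initialized to the lower-layer function. */
show_help_fn_t sketch_show_help = basic_show_help;

static show_help_fn_t saved_show_help = NULL;

/* Upper-layer init: remember the old pointer and install the aggregator. */
void upper_layer_init(void)
{
    saved_show_help = sketch_show_help;
    sketch_show_help = aggregating_show_help;
}

/* Upper-layer finalize: restore the original pointer so later calls work. */
void upper_layer_finalize(void)
{
    if (saved_show_help != NULL) {
        sketch_show_help = saved_show_help;
        saved_show_help = NULL;
    }
}

int sketch_aggregated_count(void)
{
    return aggregated_count;
}
```

Callers always go through `sketch_show_help(...)`, so code in the lower layer picks up aggregation after init without knowing that the upper layer exists, and falls back to direct printing after finalize.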
Terry Dontje
b3f2ac8d46 removed direct include of stdbool.h from event.h that was causing studio C++ issues. Also removed include of stdbool.h in a couple other places since it was already being pulled in via opal_config_bottom.h.
This commit was SVN r23963.
2010-10-27 20:47:42 +00:00
Ralph Castain
bab990d812 Revert r23928 as being the incorrect fix. The correct fix is not to include ipv6 interfaces when ipv6 support was not requested.
This commit was SVN r23930.

The following SVN revision numbers were found above:
  r23928 --> open-mpi/ompi@7394f6d167
2010-10-25 14:31:18 +00:00
Ralph Castain
7394f6d167 Silence warnings about IPV6 sa_family not known when ipv6 support is not enabled in configure
This commit was SVN r23928.
2010-10-25 13:56:23 +00:00
Jeff Squyres
73bcc4a36b Fix mistake that came in via the ompi-agen tree in r23764. The mistake wasn't part of the core autogen upgrade; it was an additional 'bonus' cleanup. Oops. The mistake will always create a set of directories under installdir, even if you do not --with-devel-headers. The set of directories will be empty, but still -- they should not be there at all. This commit fixes that -- the directories are not created at all if you do not --with-devel-headers
This commit was SVN r23801.

The following SVN revision numbers were found above:
  r23764 --> open-mpi/ompi@40a2bfa238
2010-09-24 22:53:28 +00:00