1
1

3690 Коммитов

Автор SHA1 Сообщение Дата
Nathan Hjelm
669f0afd14 ugni: poll smsg mailbox until it is empty
This commit was SVN r25794.
2012-01-26 20:50:09 +00:00
Nathan Hjelm
2a83297f96 silence vader warnings
This commit was SVN r25793.
2012-01-26 20:07:33 +00:00
Mike Dubman
6c954ad43f set mxm to call opal_progress in tight loops
This commit was SVN r25788.
2012-01-26 18:33:43 +00:00
Nathan Hjelm
3d9bc68435 reorder udcm init/finalize code. fixes trac:2973
This commit was SVN r25787.

The following Trac tickets were found above:
  Ticket 2973 --> https://svn.open-mpi.org/trac/ompi/ticket/2973
2012-01-26 16:28:55 +00:00
Shiqing Fan
2c9a4beffd Add and remove a few components for windows build.
This commit was SVN r25775.
2012-01-25 09:01:27 +00:00
Nathan Hjelm
7b9bf6fe41 ugni: remove another erroneous error message
This commit was SVN r25768.
2012-01-23 21:23:01 +00:00
Nathan Hjelm
f3b60062cb ugni: remove erroneous error message
This commit was SVN r25767.
2012-01-23 21:05:24 +00:00
Nathan Hjelm
521546aaa3 bug fix: ugni: pack only as many bytes as the pml requested
This commit was SVN r25766.
2012-01-23 17:21:45 +00:00
Ralph Castain
f03b82ab0a Don't fiddle with the port_name memory as, per standard, this is an input-only parameter
This commit was SVN r25756.
2012-01-20 13:15:41 +00:00
Ralph Castain
be3dfb6a1a Ensure that we only add -lpmi once to the wrapper compilers, no matter how many components might use it.
This commit was SVN r25753.
2012-01-20 04:56:38 +00:00
Ralph Castain
f5c43e8d60 Get the header files into the tarball
This commit was SVN r25746.
2012-01-19 21:02:20 +00:00
Nathan Hjelm
804a494036 zero out ugni fragments in constructor
This commit was SVN r25731.
2012-01-17 19:52:26 +00:00
Vishwanath Venkatesan
1e95d8b1e2 remove the MPI functions used in these files by the OMPI internal corresponding functionality and also add error checking in these for functions which did not have them'
This commit was SVN r25723.
2012-01-13 17:21:51 +00:00
Rolf vandeVaart
16d676aa5b Fix minor issue with CUDA. Cannot register overlappiing regions.
This commit was SVN r25716.
2012-01-12 13:00:42 +00:00
Samuel Gutierrez
63869c431b init seg_num_procs_inited to zero before the atomic add.
This commit was SVN r25710.
2012-01-11 03:37:23 +00:00
Nathan Hjelm
96c1df8d90 clean up vader registration code
This commit was SVN r25704.
2012-01-10 22:33:22 +00:00
Edgar Gabriel
fb4d1a7099 remove the MPI functions used in this file by the OMPI internal corresponding functionality.
This commit was SVN r25703.
2012-01-10 19:55:05 +00:00
Nathan Hjelm
f65f6f5c39 bugfix: ugni: increase smsg mailbox size to a multiple of 4096
This commit was SVN r25702.
2012-01-10 19:50:25 +00:00
Mike Dubman
37dc53bbc9 mxm: return the MXM_REQ_SEND_SYNC flag to mxm_req_send
This commit was SVN r25694.
2012-01-06 18:56:28 +00:00
Mike Dubman
3b97d609a8 mtl_mxm: fix double free
This commit was SVN r25693.
2012-01-06 16:22:58 +00:00
Samuel Gutierrez
d1a44ecd34 send packed buffers instead of using iovecs in common sm rml. this commit will
hopefully resolve the periodic bus errors that some mtt tests have been
encountering.

This commit was SVN r25692.
2012-01-05 00:11:59 +00:00
Rolf vandeVaart
9441f33981 Improve an error message. Replace tabs with spaces.
This commit was SVN r25688.
2012-01-03 15:19:01 +00:00
Rolf vandeVaart
8073f5002a Some additional CUDA specific code.
Adding a few more support functions that will be used in future development.

This commit was SVN r25684.
2011-12-29 12:31:54 +00:00
Edgar Gabriel
e0139a2d7e provide descriptions about the functionality of these frameworks.
This commit was SVN r25682.
2011-12-22 19:42:00 +00:00
Vishwanath Venkatesan
0f928be1d5 Modifying selection logic back to select two-phase at the cases it should.
This commit was SVN r25681.
2011-12-22 01:01:32 +00:00
Vishwanath Venkatesan
37c8470e3d modified implementation for two-phase write_all incorporating romio style domain partitioning
This commit was SVN r25680.
2011-12-22 00:16:29 +00:00
Vishwanath Venkatesan
738a67b704 Removing duplicate code while setting default file view and using internal file-set-view for setting the default file view
This commit was SVN r25679.
2011-12-21 21:50:47 +00:00
Rolf vandeVaart
6ca186fb64 Delay some initialization until needed. This eliminates some
warnings and removes need for CUDA init before MPI_Init.

This commit was SVN r25678.
2011-12-21 15:21:57 +00:00
Samuel Gutierrez
519f71ab7e silences valgrind warning in common sm (Syscall param writev(vector[...]) points
to uninitialised byte(s)).  probably also silences a large stack allocation
warning in coverity.

This commit was SVN r25666.
2011-12-16 23:17:48 +00:00
Samuel Gutierrez
0ca6603fa0 remove some unused cruft in shmem. minor common sm cleanup.
This commit was SVN r25665.
2011-12-16 22:43:55 +00:00
Nathan Hjelm
71527c8058 minor ugni btl code cleanup
This commit was SVN r25618.
2011-12-10 08:20:46 +00:00
Nathan Hjelm
c8a4687402 don't set SIGSEGV to default
This commit was SVN r25610.
2011-12-09 21:54:05 +00:00
Nathan Hjelm
e03d23d96e Intial support for Cray's uGNI interface (XE-6/XK-6)
This commit was SVN r25608.
2011-12-09 21:24:07 +00:00
Nathan Hjelm
87b7e85d53 rfc timeout. retry registration after removing old registration from lru
This commit was SVN r25587.
2011-12-07 18:20:44 +00:00
Josh Hursey
e56b4de2c9 Fixes trac:2550 : Cleanup comment in crcp_bkmrk_pml.h
This commit was SVN r25585.

The following Trac tickets were found above:
  Ticket 2550 --> https://svn.open-mpi.org/trac/ompi/ticket/2550
2011-12-07 14:50:04 +00:00
Jeff Squyres
c10f41c87e Do not build these frameworks when --disable-mpi-io is specified.
Fixes some Cisco MTT MPI install errors.

This commit was SVN r25566.
2011-12-02 22:11:23 +00:00
Ralph Castain
07655e2945 Handle the case where the allocator "fibs" to us about the node names. In some cases (ahem...you know who you are!), the allocator will tell us a node number (e.g., "16"). However, the daemon will return a node name (e.g., "nid0016") - leaving us not recognizing its location.
So provide a new parameter (can't have too many!) that handles this situation by stripping the prefix from the returned node name. Also do a little cleanup to ensure we cleanly exit from errors, without generating too many annoying messages.

This commit was SVN r25562.
2011-12-02 14:10:08 +00:00
Ralph Castain
357ac14530 Can't return a numerical value here
This commit was SVN r25559.
2011-12-02 10:36:57 +00:00
Nathan Hjelm
bb1fec0407 added put/get btl descriptor flags
This commit was SVN r25553.
2011-11-30 21:37:23 +00:00
Ralph Castain
c56acf60ca Although we never really thought about it, we made an unconscious assumption in the mapper system - we assumed that the daemons would be placed on nodes in the order that the nodes appear in the allocation. In other words, we assumed that the launch environment would map processes in node order.
Turns out, this isn't necessarily true. The Cray, for example, launches processes in a toroidal pattern, thus causing the daemons to wind up somewhere other than what we thought. Other environments (e.g., slurm) are also capable of such behavior, depending upon the default mapping algorithm they are told to use.

Resolve this problem by making the daemon-to-node assignment in the affected environments when the daemon calls back and tells us what node it is on. Order the nodes in the mapping list so they are in daemon-vpid order as opposed to the order in which they show in the allocation. For environments that don't exhibit this mapping behavior (e.g., rsh), this won't have any impact.

Also, clean up the vm launch procedure a little bit so it more closely aligns with the state machine implementation that is coming, and remove some lingering "slave" code.

This commit was SVN r25551.
2011-11-30 19:58:24 +00:00
Jeff Squyres
6fbbfd0f7a Gah! r25545 acidentally included ''waaaay'' more stuff than it was
supposed to.  I.e., half-baked/not complete stuff.

This commit backs out all of r25545.  Sorry folks!

This commit was SVN r25546.

The following SVN revision numbers were found above:
  r25545 --> open-mpi/ompi@7f9ae11faf
2011-11-29 23:24:52 +00:00
Jeff Squyres
7f9ae11faf Per http://www.open-mpi.org/community/lists/users/2011/11/17862.php,
to make MPI_IN_PLACE (and other sentinel Fortran constants) work on OS
X, we need to use the following compiler (linker) flag:

    -Wl,-commons,use_dylibs 

So if we're compiling on OS X, test to see if that flag works with the
compiler.  If so, add it to the wrapper FFLAGS and FCFLAGS (note that
per a future update, we'll only have one Fortran compiler anyway).

Fixes trac:1982.  

This commit was SVN r25545.

The following Trac tickets were found above:
  Ticket 1982 --> https://svn.open-mpi.org/trac/ompi/ticket/1982
2011-11-29 23:05:54 +00:00
Terry Dontje
5209de048c add code to service_thread_start to handle EBADF returns from select. This commit fixes trac:2922.
This commit was SVN r25520.

The following Trac tickets were found above:
  Ticket 2922 --> https://svn.open-mpi.org/trac/ompi/ticket/2922
2011-11-29 16:49:59 +00:00
Samuel Gutierrez
375162c693 this commit fixes a few things. 1. silence warning in common sm. 2. remove unneeded config code in common sm. 3. move opal_shmem_base_close to a better place in opal_finalize. 4. fix opal_path_nfs output.
This commit was SVN r25518.
2011-11-28 23:41:19 +00:00
George Bosilca
0bd2bf9aae The number of segments accepted should be bounded by MCA_BTL_DES_MAX_SEGMENTS
and not by 2.

This commit was SVN r25515.
2011-11-28 17:19:12 +00:00
Nathan Hjelm
f8c8c641f1 added asserts to warn developers that ob1/csum match fragments do not support more than 2 segments
This commit was SVN r25514.
2011-11-28 16:12:25 +00:00
Samuel Gutierrez
b4edf0ff5c getting ready for 1.5 port of the shared memory enhancements. remove some unused/unneeded stuff and minor style update.
This commit was SVN r25513.
2011-11-28 16:08:32 +00:00
Ralph Castain
9b59d8de6f This is actually a much smaller commit than it appears at first glance - it just touches a lot of files. The --without-rte-support configuration option has never really been implemented completely. The option caused various objects not to be defined and conditionally compiled some base functions, but did nothing to prevent build of the component libraries. Unfortunately, since many of those components use objects covered by the option, it caused builds to break if those components were allowed to build.
Brian dealt with this in the past by creating platform files and using "no-build" to block the components. This was clunky, but acceptable when only one organization was using that option. However, that number has now expanded to at least two more locations.

Accordingly, make --without-rte-support actually work by adding appropriate configury to prevent components from building when they shouldn't. While doing so, remove two frameworks (db and rmcast) that are no longer used as ORCM comes to a close (besides, they belonged in ORCM now anyway). Do some minor cleanups along the way.

This commit was SVN r25497.
2011-11-22 21:24:35 +00:00
Ralph Castain
6310361532 At long last, the fabled revision to the affinity system has arrived. A more detailed explanation of how this all works will be presented here:
https://svn.open-mpi.org/trac/ompi/wiki/ProcessPlacement

The wiki page is incomplete at the moment, but I hope to complete it over the next few days. I will provide updates on the devel list. As the wiki page states, the default and most commonly used options remain unchanged (except as noted below). New, esoteric and complex options have been added, but unless you are a true masochist, you are unlikely to use many of them beyond perhaps an initial curiosity-motivated experimentation.

In a nutshell, this commit revamps the map/rank/bind procedure to take into account topology info on the compute nodes. I have, for the most part, preserved the default behaviors, with three notable exceptions:

1. I have at long last bowed my head in submission to the system admin's of managed clusters. For years, they have complained about our default of allowing users to oversubscribe nodes - i.e., to run more processes on a node than allocated slots. Accordingly, I have modified the default behavior: if you are running off of hostfile/dash-host allocated nodes, then the default is to allow oversubscription. If you are running off of RM-allocated nodes, then the default is to NOT allow oversubscription. Flags to override these behaviors are provided, so this only affects the default behavior.

2. both cpus/rank and stride have been removed. The latter was demanded by those who didn't understand the purpose behind it - and I agreed as the users who requested it are no longer using it. The former was removed temporarily pending implementation.

3. vm launch is now the sole method for starting OMPI. It was just too darned hard to maintain multiple launch procedures - maybe someday, provided someone can demonstrate a reason to do so.

As Jeff stated, it is impossible to fully test a change of this size. I have tested it on Linux and Mac, covering all the default and simple options, singletons, and comm_spawn. That said, I'm sure others will find problems, so I'll be watching MTT results until this stabilizes.

This commit was SVN r25476.
2011-11-15 03:40:11 +00:00
Jeff Squyres
e8dcad6017 This typo has been here since August 2005. :-)
This commit was SVN r25468.
2011-11-11 03:01:52 +00:00