openmpi

Author	SHA1	Message	Date
Aurelien Bouteiller	4df5fcf48c	errors_are_fatal_comm_handler takes a pointer to the error constant as input. Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>	2020-08-26 16:05:30 -04:00
Aurelien Bouteiller	ee149fcfcb	MPI3 (unchanged in 4) says that errors after MPI_REQUEST_FREE are FATAL Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>	2020-07-31 17:49:38 -04:00
Aurelien Bouteiller	bec7dfc1b1	Errors in non-api calls remain fatal Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>	2020-07-31 17:49:35 -04:00
Aurélien Bouteiller	b37202c74e	Add compliance mode with MPI-4 routing of errors to MPI_COMM_SELF by default And other streamlining of aborting behavior. Signed-off-by: Aurélien Bouteiller <bouteill@icl.utk.edu> Remove OMPI_COMM_ERRORS and use NOHANDLE macros instead. Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu> route unbound errors to self error handler Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu> Do not raise the error handler from within components Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>	2020-07-23 05:09:29 -04:00
bosilca	1139d9ecae	Merge pull request #7931 from bosilca/fix/7928 Fix the BTL API conversion for the SMCUDA BTL	2020-07-18 17:35:39 -04:00
George Bosilca	96e8cbe25f	First step on fixing the BTL API conversion for the SMCUDA BTL Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2020-07-13 14:46:10 -04:00
Joshua Ladd	aa8f7f4ede	Merge pull request #7893 from bureddy/cuda-ucx UCX: initialize cuda from ucx pml component	2020-07-13 14:18:48 -04:00
Devendar Bureddy	2547e24c55	UCX: initialize cuda from ucx pml component Signed-off-by: Devendar Bureddy <devendar@mellanox.com>	2020-07-12 18:41:40 +03:00
Nathan Hjelm	88f51fbb8e	btl: change argument type of BTL receive callbacks This commit updates the btl interface to change the parameters passed to receive callbacks. The interface used to pass the tag, a btl base descriptor, and the callback context. Most of the values in the btl base descriptor were unused and only helped simplify the callbacks from the self btl. All of the arguments have now been replaced with a single receive callback descriptor. This descriptor contains the incoming endpoint, data segment(s), tag, and callback context. All btls have been updated to use the new callback and the btl interface version has been bumped to v3.2.0. As part of this change the descriptor argument (and the segments contained within it) have been marked as const. They were treated as const before but this change could allow the compiler to make better optimization decisions and will enforce that the callback does not attempt to change the data in the descriptor. Signed-off-by: Nathan Hjelm <hjelmn@google.com>	2020-07-08 07:38:46 -07:00
Sergey Oblomov	75bda25ddb	OPAL/UCX: enabling new API provided by UCX - added detection of new API into configuration - added tag_send call implemented using new API - added MPI_Send/MPI_Isend/MPI_Recv/MPI_Irecv implementations Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2020-05-01 17:58:29 +03:00
Dipti Kothari	5418cc56dd	mca/pml: PML check for direct modex For direct modex, all procs publish the selected pml module and then at add_procs pml module for each proc is checked against every other proc in the add_proc call. For full modex, there is no change in functionality. Only Rank0 publishes its selected pml, all other procs in the add_proc call check their selected pml against Rank0. If pml's do not match, throw error and exit. Signed-off-by: Dipti Kothari <dkothar@amazon.com>	2020-04-22 16:25:01 +00:00
Yossi Itigin	5dcd1f4e6c	Merge pull request #7575 from yosefe/topic/pml-ucx-fix-usage-of-mca-pml pml/ucx: Fix usage of mca_pml_base_pml_check_selected()	2020-03-30 20:06:12 +03:00
Nathan Hjelm	160ff188b8	Merge pull request #7169 from hjelmn/fix_what_wg21_calls_our_problem_not_theirs_seriously__in_some_ways_they_are_correct_but_wtf configure: use -iquote for non-system include paths	2020-03-30 09:22:54 -07:00
Yossi Itigin	124f0c0d1f	pml/ucx: Fix usage of mca_pml_base_pml_check_selected() Pass the correct ompi_proc_t and array length to mca_pml_base_pml_check_selected() during dynamic modex. Signed-off-by: Yossi Itigin <yosefe@mellanox.com>	2020-03-29 17:46:45 +03:00
Howard Pritchard	f136a20cae	Merge pull request #6578 from hppritcha/topic/thread_framework2 Implement a MCA framework for threads	2020-03-27 15:55:48 -06:00
Noah Evans	ee3517427e	Add threads framework Add a framework to support different types of threading models including user space thread packages such as Qthreads and Argobots: https://github.com/pmodels/argobots https://github.com/Qthreads/qthreads The default threading model is pthreads. Alternate thread models are specified at configure time using the --with-threads=X option. The framework is static. The threading model to use is selected at Open MPI configure/build time. mca/threads: implement Argobots threading layer config: fix thread configury - Add double quotations - Change Argobot to Argobots config: implement Argobots check If the poll time is too long, MPI hangs. This quick fix just sets it to 0, but it is not good for the Pthreads version. Need to find a good way to abstract it. Note that even 1 (= 1 millisecond) causes disastrous performance degradation. rework threads MCA framework configury It now works more like the ompi/mca/rte configury, modulo some edge items that are special for threading package linking, etc. qthreads module some argobots cleanup Signed-off-by: Noah Evans <noah.evans@gmail.com> Signed-off-by: Shintaro Iwasaki <siwasaki@anl.gov> Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2020-03-27 10:15:45 -06:00
Ralph Castain	33ab928e1b	ompi_proc_t size reduction: part 1 We currently save the hostname of a proc when we create the ompi_proc_t for it. This was originally done because the only method we had for discovering the host of a proc was to include that info in the modex, and we had to therefore store it somewhere proc-local. Obviously, this carried a memory penalty for storing all those strings, and so we added a "cutoff" parameter so that we wouldn't collect hostnames above a certain number of procs. Unfortunately, this still results in an 8-byte/proc memory cost as we have a char* pointer in the opal_proc_t that is contained in the ompi_proc_t so that we can store the hostname of the other procs if we fall below the cutoff. At scale, this can consume a fair amount of memory. With the switch to relying on PMIx, there is no longer a need to cache the proc hostnames. Using the "optional" feature of PMIx_Get, we restrict the retrieval to be purely proc-local - i.e., we retrieve the info either via shared memory or from within the proc-internal hash storage (depending upon the active PMIx components). Thus, the retrieval of a hostname is purely a local operation involving no communication. All RM's are required to provide a complete hostname map of all procs at startup. Thus, we have full access to all hostnames without including them in a modex or having to cache them on each proc. This allows us to remove the char* pointer from the opal_proc_t, saving us 8-bytes/proc. Unfortunately, PMIx_Get does not currently support the return of a static pointer to memory. Thus, even though PMIx has the hostname in its memory, it can only return a malloc'd version of it. I have therefore ensured that the return from opal_get_proc_hostname is consistently malloc'd and free'd wherever used.
This shouldn't be a burden as the hostname is only used in one of two circumstances: (a) in an error message (b) in a verbose output for debugging purposes Thus, there should be no performance penalty associated with the malloc/free requirement. PMIx will eventually be returning static pointers, and so we can eventually simplify this method and return a "const char*" - but as noted, this really isn't an issue even today. Signed-off-by: Ralph Castain <rhc@pmix.org>	2020-03-23 12:49:44 -07:00
Gilles Gouaillardet	69bc2e8372	misc: fix <> vs "" includes throughout the ompi codebase This commit fixes an issue with the include usage in some ompi source files. These source files are using the <> form of include when the "" form is correct (as these are internal, not system headers). Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> Signed-off-by: Nathan Hjelm <hjelmn@google.com>	2020-03-09 21:13:49 -04:00
Austen Lauria	04a3a28a74	Some memchecker cleanup and others. - Port memchecker call from a1d502c. - Remove unused memcheck macro variables. - Some code readability improvements. - Remove some stray +1's in dynamic comm cleanup. - Re-add OPAL_ENABLE_DEBUG macro to osc header. - Cleanup some printf's, and includes. - Refactor cleanup of dpm_disconnect_objs. Signed-off-by: Austen Lauria <awlauria@us.ibm.com>	2020-03-05 16:44:18 -05:00
Gilles Gouaillardet	e2ad184db5	pml/ob1: silence valgrind errors always define and initialize padding in various structs when OPAL_ENABLE_DEBUG is set Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2020-03-05 16:10:43 -05:00
Harumi Kuno	397cc44aa4	Fix typo in mca_pml_base_pml_check_selected Addresses issue https://github.com/open-mpi/ompi/issues/7494 Signed-off-by: Harumi Kuno <harumi.kuno@hpe.com>	2020-03-02 22:49:40 -08:00
Ralph Castain	4fe9ae329c	Add missing include and remove stale PML The "yalla" pml no longer exists Signed-off-by: Ralph Castain <rhc@pmix.org>	2020-02-29 11:54:38 -08:00
Ralph Castain	cbbe67eff9	Merge pull request #7487 from bosilca/topic/pml_from_vpid0 Make sure the PML selection is consistent across the world.	2020-02-28 17:19:26 -08:00
bosilca	c4d36859ec	Merge pull request #7228 from devreal/progress-returns Harmonize return values of progress callbacks	2020-02-28 20:15:37 -05:00
bosilca	806b35157d	Merge pull request #7261 from bosilca/fix/vprotocol Fix/vprotocol initialization	2020-02-28 20:14:30 -05:00
George Bosilca	21d743393f	Make sure the PML is consistent across the world. Temporary solution for the PML inconsistency issue discussed in #7475. This patch addresses 2 things: first, it makes the PMIx key optional so that if we are not in a full modex mode we don't do a direct modex, and second, it gets the PML info from vpid 0 instead of from the local rank. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2020-02-28 17:53:48 -05:00
Gilles Gouaillardet	174e967dbc	Remove ORTE project Will be replaced by PRRTE. Ensure that OMPI and OPAL layers build without reference to ORTE. Setup opal/pmix framework to be static. Remove support for all PMI-1 and PMI-2 libraries. Add support for "external" pmix component as well as internal v4 one. remove orte: misc fixes - UCX fixes - VPATH issue - oshmem fixes - remove useless definition - Add PRRTE submodule - Get autogen.pl to traverse PRRTE submodule - Remove stale orcm reference - Configure embedded PRRTE - Correctly pass the prefix to PRRTE - Correctly set the OMPI_WANT_PRRTE am_conditional - Move prrte configuration to the end of OMPI's configure.ac - Make mpirun a symlink to prun, when available - Fix makedist with --no-orte/--no-prrte option - Add a `--no-prrte` option which is the same as the legacy `--no-orte` option. - Remove embedded PMIx tarball. Replace it with new submodule pointing to OpenPMIx master repo's master branch - Some cleanup in PRRTE integration and add config summary entry - Correctly set the hostname - Fix locality - Fix singleton operations - Fix support for "tune" and "am" options Signed-off-by: Ralph Castain <rhc@pmix.org> Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2020-02-07 18:20:06 -08:00
Joseph Schuchart	2c97187ee0	Harmonize return values of progress callbacks Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>	2020-01-28 20:15:03 +01:00
Austen Lauria	10f6a77640	Merge pull request #7315 from abouteiller/export/tcp_errors_v2 Handle error cases in TCP BTL (v2)	2020-01-27 17:03:07 -05:00
Aurelien Bouteiller	395fcc4253	Disable inband PML error reporting during MPI Finalize as it interferes with the Finalize process. Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>	2020-01-27 13:32:34 -05:00
Austen Lauria	4f6978466d	Merge pull request #7284 from bosilca/fix/monitoring_registration Minor cleanup in the monitoring PML.	2020-01-27 13:01:30 -05:00
Charles Shereda	cbc6feaab2	Created opal_gethostname() as safer gethostname substitute. The opal_gethostname() function provides a more robust mechanism to retrieve the hostname than gethostname(), which can return results that are not null-terminated, and which can vary in its behavior from system to system. opal_gethostname() just returns the value in opal_process_info.nodename; this is populated in opal_init_gethostname() inside opal_init.c. -Changed all gethostname calls in opal subtree to opal_gethostname -Changed all gethostname calls in orte subtree to opal_gethostname -Changed all gethostname calls in ompi subdir to opal_gethostname -Changed all gethostname calls in oshmem subdir to opal_gethostname -Changed opal_if.c in test subdir to use opal_gethostname -Changed opal_init.c to include opal_init_gethostname. This function returns an int and directly sets opal_process_info.nodename per jsquyres' modifications. Relates to open-mpi#6801 Signed-off-by: Charles Shereda <cpshereda@lanl.gov>	2020-01-13 08:52:17 -08:00
George Bosilca	05093f9cb1	Minor cleanup in the monitoring PML. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2020-01-13 09:24:00 -05:00
George Bosilca	271deed68e	Fix the vprotocol initialization. Add a comment about the stages in the vprotocols initialization and why we can't use the threading provided during the PML initialization. Instead, use the OMPI internal threading status to prevent the use of message logging in multi-threaded applications. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2020-01-07 18:06:45 -05:00
George Bosilca	684b91a1bb	We can only specify one single PML as MCA params. Make sure the MCA parameter for the PML selection only contains a single value. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2020-01-07 18:06:45 -05:00
Barton Chittenden	47816ef83b	Remove mxm, yalla and ikrit Signed-off-by: Barton Chittenden <bartonski@gmail.com>	2019-11-22 13:40:16 -08:00
Mark Allen	6855ebb84b	Adding -mca comm_method to print table of communication methods

This is closely related to Platform-MPI's old -prot feature. The long-format of the tables it prints could look like this:

> Host 0 [myhost001] ranks 0 - 1
> Host 1 [myhost002] ranks 2 - 3
> Host 2 [myhost003] ranks 4
> Host 3 [myhost004] ranks 5
> Host 4 [myhost005] ranks 6
> Host 5 [myhost006] ranks 7
> Host 6 [myhost007] ranks 8
> Host 7 [myhost008] ranks 9
> Host 8 [myhost009] ranks 10
>
>  host | 0    1    2    3    4    5    6    7    8
> ======|==============================================
>     0 : sm   tcp  tcp  tcp  tcp  tcp  tcp  tcp  tcp
>     1 : tcp  sm   tcp  tcp  tcp  tcp  tcp  tcp  tcp
>     2 : tcp  tcp  self tcp  tcp  tcp  tcp  tcp  tcp
>     3 : tcp  tcp  tcp  self tcp  tcp  tcp  tcp  tcp
>     4 : tcp  tcp  tcp  tcp  self tcp  tcp  tcp  tcp
>     5 : tcp  tcp  tcp  tcp  tcp  self tcp  tcp  tcp
>     6 : tcp  tcp  tcp  tcp  tcp  tcp  self tcp  tcp
>     7 : tcp  tcp  tcp  tcp  tcp  tcp  tcp  self tcp
>     8 : tcp  tcp  tcp  tcp  tcp  tcp  tcp  tcp  self
>
> Connection summary:
>   on-host:  all connections are sm or self
>   off-host: all connections are tcp

In this example hosts 0 and 1 had multiple ranks so "sm" was more meaningful than "self" to identify how the ranks on the host are talking to each other, while hosts 2..8 were one rank per host so "self" was more meaningful as their btl.

Above a certain number of hosts (12 by default) the above table gets too big so we shrink to a more abbreviated looking table that has the same data:

>  host | 0 1 2 3 4 8
> ======|====================
>     0 : A C C C C C C C C
>     1 : C A C C C C C C C
>     2 : C C B C C C C C C
>     3 : C C C B C C C C C
>     4 : C C C C B C C C C
>     5 : C C C C C B C C C
>     6 : C C C C C C B C C
>     7 : C C C C C C C B C
>     8 : C C C C C C C C B
> key: A == sm
> key: B == self
> key: C == tcp

Then above 36 hosts we stop printing the 2d table entirely and just print the summary:

> Connection summary:
>   on-host:  all connections are sm or self
>   off-host: all connections are tcp

The options to control it are

    -mca comm_method 1 : print the above table at the end of MPI_Init
    -mca comm_method 2 : print the above table at the beginning of MPI_Finalize
    -mca comm_method_max <n> : number of hosts <n> for which to print a full size 2d table
    -mca comm_method_brief 1 : only print summary output, no 2d table
    -mca comm_method_fakefile <filename> : for debugging only

* printing at init vs finalize:

The most important difference between these two is that when printing the table during MPI_Init(), we send extra messages to make sure all hosts are connected to each other. So the table ends up working against the idea of on-demand connections (although it's only forcing the n^2 connections in the number of hosts, not the total ranks). If printing at MPI_Finalize() we don't create any connections that aren't already connected, so the table is more likely to have "n/a" entries if some hosts never connected to each other.

* how many hosts <n> for which to print a full size 2d table

The option -mca comm_method_max <n> can be used to specify a number of hosts <n> (default 12) that controls at what host-count the unabbreviated / abbreviated 2d tables get printed:

    1 - n : full size 2d table
    n+1 - 3n : shortened 2d table
    3n+1 - inf : summary only, no 2d table

* brief

The option -mca comm_method_brief 1 can be used to skip the printing of the 2d table and only show the short summary.

* fakefile

This is a debugging option that allows easier testing of all the printout routines by letting all the detected communication methods between the hosts be overridden by fake data from a file.

The source of the information used in the table is the .mca_component_name. In the case of BTLs, the module always had a .btl_component linking back to the component. The vars mca_pml_base_selected_component and ompi_mtl_base_selected_component offer similar functionality for pml/mtl. So with the ability to identify the component, we can then access the component name with code like this:

    mca_pml_base_selected_component.pmlm_version.mca_component_name

See the three lookup_{pml,mtl,btl}_name() functions in hook_comm_method_fns.c, and their use in comm_method() to parse the strings and produce an integer to represent the connection type being used.

Signed-off-by: Mark Allen <markalle@us.ibm.com>	2019-10-31 16:23:57 -04:00
Gilles Gouaillardet	33361aa124	pml/ucx: correctly handle zero size datatypes zero-size derived datatypes are now flagged as OPAL_DATATYPE_FLAG_CONTIGUOUS so update mca_pml_ucx_init_datatype() to correctly handle them. Since 'size' is a 'size_t', the assertion can simply be removed. Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2019-10-09 16:54:00 +09:00
Sergey Oblomov	43186e494b	UCX: added PPN hint for UCX context - added PPN hint for UCX context init Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2019-08-05 18:07:06 +03:00
Nysal Jan K.A	fe4ef147f8	pml/ucx: Fix the max tag and context id values Signed-off-by: Nysal Jan K.A <jnysal@in.ibm.com>	2019-07-03 14:33:01 +05:30
Yossi Itigin	8535dd570b	Merge pull request #6732 from dmitrygladkov/topic/pml/ucx_init PML/UCX: Don't destroy UCP worker if it wasn't created	2019-06-06 10:41:33 +03:00
Dmitry Gladkov	c864ca51d2	PML/UCX: Don't destroy UCP worker if it wasn't created Signed-off-by: Dmitry Gladkov <dmitrygla@mellanox.com>	2019-06-03 10:49:36 +03:00
Sergey Oblomov	a3578d9ece	PML/UCX: disable PML UCX if MT is requested but not supported - if multithreading is requested but not supported, disable PML UCX Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2019-05-17 11:25:23 +03:00
George Bosilca	a16cf0e4dd	Fix the leak of fragments for persistent sends. The rdma_frag attached to the send request was not correctly released upon request completion, leaking until MPI_Finalize. A quick solution would have been to add RDMA_FRAG_RETURN at different locations on the send request completion, but it would have unnecessarily made the sendreq completion path more complex. Instead, I added the length to the RDMA fragment so that it can be completed during the remote ack. Be more explicit on the comment. The rdma_frag can only be freed once when the peer forced a protocol change (from RDMA GET to send/recv). Otherwise the fragment will be returned once all data pertaining to it has been transferred. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2019-05-02 09:40:11 -04:00
bosilca	399b7133ab	Merge pull request #6556 from EmmanuelBRELLE/PR_fix_local_handle_in_PUT_message pml/ob1: fixed local handle sent during PUT control message	2019-04-27 13:51:22 -04:00
Jeff Squyres	9a9d106296	Merge pull request #6555 from EmmanuelBRELLE/PR-pmlob1_fix_rc_for_putfrag_when_get_failed pml/ob1: fixed exit from get_frag_fail when falling back on btl_put	2019-04-22 17:19:12 -04:00
Brelle Emmanuel	e630046a4b	pml/ob1: fixed local handle sent during PUT control message In case of using a btl_put in ob1, the handle of the locally registered memory is sent with a PUT control message. In the current master code the sent handle is necessarily the handle in the frag, but if the handle has been successfully registered in the request, the frag structure does not have any valid handle and all fragments use the request one. I suggest checking whether the handle in the fragment is valid and, if not, sending the handle from the request. Signed-off-by: Brelle Emmanuel <emmanuel.brelle@atos.net>	2019-04-01 18:45:05 +02:00
Brelle Emmanuel	9c689f2225	pml/ob1: fixed exit from get_frag_fail when falling back on btl_put In the case the btl_get fails Ob1 tries to fallback on btl_put first but the return code was ignored. So the code fell back on both btl_put and btl_send. Signed-off-by: Brelle Emmanuel <emmanuel.brelle@atos.net>	2019-04-01 18:17:10 +02:00
George Bosilca	6ea0c4eab9	Prevent a segfault when accessing a rank outside a communicator. This is not fixing any issue, it is simply preventing a segfault if the communicator creation has not happened as expected. Thus, this code path should never really be hit in a correct MPI application with a valid communicator creation support. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2019-03-28 12:03:29 -04:00
Sergey Oblomov	d8e3562bae	PML/SPML/UCX: added evaluation of mmap events - a set of UCX-related issues were reported that were caused by mmap API hook conflicts. We added diagnostics for such problems to simplify the bug-resolving pipeline Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2019-03-12 21:14:27 +02:00
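
As a rough illustration of the host-count thresholds described in commit 6855ebb84b (-mca comm_method_max and -mca comm_method_brief), the choice between the three output formats might be sketched as follows. This is not the code from hook_comm_method_fns.c: the function name table_mode and its return strings are hypothetical, and only the default of 12 hosts and the n / 3n boundaries are taken from the commit message.

```python
def table_mode(num_hosts: int, max_hosts: int = 12, brief: bool = False) -> str:
    """Pick an output format from the host count (illustrative sketch only).

    max_hosts stands in for `-mca comm_method_max <n>` (default 12) and
    brief stands in for `-mca comm_method_brief 1`. The name `table_mode`
    and the returned labels are assumptions made for this example.
    """
    if num_hosts < 1:
        raise ValueError("num_hosts must be at least 1")
    if brief:
        # Brief mode skips the 2d table regardless of host count.
        return "summary"
    if num_hosts <= max_hosts:
        # 1 .. n hosts: full size 2d table
        return "full"
    if num_hosts <= 3 * max_hosts:
        # n+1 .. 3n hosts: shortened 2d table with a letter key
        return "abbreviated"
    # 3n+1 hosts and up: connection summary only
    return "summary"
```

With the default of 12, this switches from the full table to the abbreviated one at 13 hosts and to the summary alone above 36 hosts, which matches the boundaries stated in the commit message.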


1433 commits