openmpi

Author	SHA1	Message	Date
Jeff Squyres	fa10e1ea97	Create a Github issue template So that we can stop asking common questions like "What version of Open MPI are you using?", etc. [skip ci] bot:notest Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-04-12 16:18:32 -04:00
Ralph Castain	97e38e6d84	Move a free to a little later in case the verbose output needs it Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-04-11 11:21:12 -07:00
Ralph Castain	bb81f3b5db	Always setup the attach fifo, even when we initially launch under a debugger so that the user can detach and reattach later Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-04-11 08:24:03 -07:00
Nathan Hjelm	bea7d9e4f7	Merge pull request #3320 from hjelmn/osc_pt2pt_fix osc/pt2pt: fix infinite frag allocation loop	2017-04-11 09:09:30 -06:00
Artem Polyakov	4477b87e1d	Merge pull request #3303 from karasevb/timing2/master OMPI timings	2017-04-11 07:52:40 -07:00
Boris Karasev	d132eab4a5	ompi/timings: fixed the error of opal timings env import Signed-off-by: Boris Karasev <karasev.b@gmail.com>	2017-04-11 12:08:48 +06:00
Nathan Hjelm	12b52b2b2c	osc/pt2pt: fix infinite frag allocation loop Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2017-04-10 16:30:47 -06:00
Nathan Hjelm	4146ef9725	Merge pull request #3272 from hjelmn/cma_fix cma: restore --with-cma=no configure option	2017-04-10 12:17:01 -06:00
KAWASHIMA Takahiro	94092fbbab	Merge pull request #3306 from kawashima-fj/pr/darray-accumulate-fix datatype: Fix darray MPI_ACCUMULATE bug	2017-04-10 18:42:35 +09:00
KAWASHIMA Takahiro	b4599d7bb7	datatype: Fix darray MPI_ACCUMULATE bug Array sizes of `array_of_gsizes`, `array_of_distribs`, `array_of_dargs`, and `array_of_psizes` parameters of the `ompi_datatype_create_darray` function (and `MPI_TYPE_CREATE_DARRAY`) are all `ndims`. `ndims` are `i[2]`, not `i[0]`. See MPI-3.1 p.122. Because this function `__ompi_datatype_create_from_args` is used by pt2pt OSC, using a datatype created by `MPI_TYPE_CREATE_DARRAY` for `MPI_(R)(GET_)ACCUMULATE` caused a segmentation fault or something on a target process. Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>	2017-04-10 17:31:59 +09:00
Ralph Castain	95ae0d1df3	Cleanup timing macros for portability across compilers. Rename the --enable-timing configure option to be --enable-pmix-timing so it doesn't pickup external timing requests. Remove a stale function reference in PMIx so it can compile with timing enabled. Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-04-10 12:56:38 +06:00
Howard Pritchard	f5942ff23c	Merge pull request #3304 from hppritcha/topic/de-ortization-of-ompi de-ORTEfy the ompi tree	2017-04-07 14:14:41 -06:00
Howard Pritchard	2b807d06d6	Merge pull request #3300 from bwbarrett/master build: Fix platform detection on FreeBSD	2017-04-07 13:01:49 -06:00
Noah Evans	ef29fb13cb	de-ORTEfy the ompi tree The ompi tree should be runtime independent, but over time a few ORTE dependent definitions and functions have escaped into the ompi tree. I'm working on my own runtime so I've used this as an opportunity to get rid of ORTE dependencies in the ompi/ tree. I still need to go back and change orte to conform to the new world and these changes are untested, but I can now compile (but not link) without orte so I'm committing this changeset. Signed-off-by: Noah Evans <noah.evans@gmail.com>	2017-04-07 12:35:58 -06:00
Nathaniel Graham	5e44e40ca9	Merge pull request #3293 from nrgraham23/mpirun_help_parsable Add parsable option to help arguments	2017-04-07 11:35:49 -06:00
Boris Karasev	36a0e71f2d	ompi/timings: preparing for production state Adds: - enabling/disabling of timings through the environment variable `OMPI_TIMING_ENABLE` - output format: [file name]:[function name]:[description]: avg/min/max - dynamically extending the array of results for the case when the initial size is exhausted - catch and collect errors - cleanup Note: To use this feature, configure with `--enable-timings` and set env `OMPI_TIMING_ENABLE = 1` Signed-off-by: Boris Karasev <karasev.b@gmail.com>	2017-04-07 21:16:57 +06:00
Artem Polyakov	e3acf2a339	ompi/timings: add OMPI-level timing framework. This is an extension of the OPAL timing framework that allows using MPI_Reduce to provide a compact representation of the timings collected throughout the whole application. NOTE: the functionality is disabled for now; it will be enabled after runtime verification. Signed-off-by: Artem Polyakov <artpol84@gmail.com>	2017-04-07 21:16:22 +06:00
Artem Polyakov	45898a9c65	opal/timing: add the draft of env-based timings This commit adds a new timing feature that uses environment variables to expose timing information. This allows easy access to this data (if timing is enabled) from any other part of the application for subsequent postprocessing. In particular, this will be integrated with the OMPI-level timing framework, which will use MPI_Reduce functionality to provide more compact and easy-to-use information. This commit also adds an example of usage of this framework by annotating the rte_init function. The result is not used anywhere for now. It will be postprocessed in subsequent commits. NOTE: this functionality is currently disabled until it is verified at runtime Signed-off-by: Artem Polyakov <artpol84@gmail.com>	2017-04-07 21:16:22 +06:00
Artem Polyakov	88ed79ea25	opal/timing: remove old framework Signed-off-by: Artem Polyakov <artpol84@gmail.com>	2017-04-07 21:16:22 +06:00
Artem Polyakov	1063c0d567	opal/timing: remove timings from MPI_Init and MPI_Finalize Signed-off-by: Artem Polyakov <artpol84@gmail.com>	2017-04-07 21:16:21 +06:00
Artem Polyakov	482d7c9322	opal/timing: remove RML timings Signed-off-by: Artem Polyakov <artpol84@gmail.com>	2017-04-07 21:16:21 +06:00
Artem Polyakov	79100de014	opal/timing: Remove oob tracing Signed-off-by: Artem Polyakov <artpol84@gmail.com>	2017-04-07 21:16:21 +06:00
Ralph Castain	eba6c6b827	Merge pull request #3301 from rhc54/topic/faults Correctly identify the source of the event when notifying of abnormal termination by a proces	2017-04-06 21:43:34 -07:00
Ralph Castain	b526bca56c	Fix a potential segfault by avoiding NULL topologies prior to launching the VM. Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-04-06 20:51:19 -07:00
Ralph Castain	b33b4607df	Correctly identify the source of the event when notifying of abnormal termination by a process Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-04-06 20:50:38 -07:00
Brian Barrett	ab3ac6d0ea	build: Fix platform detection on FreeBSD Look for amd64 in addition to x86_64 as the platform type for x86_64 assembly. The FreeBSD-packaged Autoconf package has a patch to return amd64-unknown-freebsd11.0 instead of the x86_64-unknown-freebsd11.0 that a stock Autoconf package would return. Since we want to run Jenkins builds on FreeBSD, working around the FreeBSD patch is probably the easiest thing. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2017-04-06 20:27:22 -07:00
Ralph Castain	666386fc19	Merge pull request #3294 from rhc54/topic/modx Enable SLURM on Cray with constraints and fix bug in nidmap	2017-04-06 09:55:07 -07:00
Ralph Castain	a29ca2bb0d	Enable slurm operations on Cray with constraints Cleanup some errors in the nidmap code that caused us to send unnecessary topologies Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-04-06 08:58:06 -07:00
Brian Barrett	d7f283cbce	README: Update supported platform list Per discussion at last developer's forum, platforms not actively being tested (either in Jenkins or at least weekly in MTT) are not eligible to be listed as supported platforms. Move a number of systems out of the supported list. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2017-04-05 17:25:01 -07:00
Nathaniel Graham	36d660e07a	Add parsable option to help arguments This commit adds a "parsable" option to the help arguments, which prints out a machine readable list of all the mpirun options. Fixes #3279 Signed-off-by: Nathaniel Graham <ngraham@lanl.gov>	2017-04-05 17:01:43 -06:00
Ralph Castain	bf668ad1e9	Merge pull request #3287 from rhc54/topic/ht Provide further (hopefully) helpful messages about the hotel size	2017-04-05 07:26:03 -07:00
Ralph Castain	db8943cedd	Provide further (hopefully) helpful messages about the hotel size Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-04-05 04:27:32 -07:00
Ralph Castain	840d6c9a1d	Merge pull request #3284 from rhc54/topic/hotel Resolve the direct modex race condition.	2017-04-05 03:09:27 -07:00
Gilles Gouaillardet	8d1369db49	Merge pull request #3283 from ggouaillardet/topic/nvml build nvml support only with CUDA support	2017-04-05 16:14:23 +09:00
Ralph Castain	b7e9711f45	Resolve the direct modex race condition. The request hotel was running out of rooms, thereby returning an error upon checkin - and we had missed error_logging a couple of those places. Hence no error message and things just hung. Output a (hopefully) helpful message when we timeout an operation Thanks to Nathan for tracking it down. Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-04-04 21:32:44 -07:00
Ralph Castain	9a69b20d09	Merge pull request #3282 from rhc54/topic/direct Set the PARENT vpid for direct routed module	2017-04-04 20:55:12 -07:00
Gilles Gouaillardet	10ea991d0a	hwloc: add CUDA include dir to CPPFLAGS so hwloc configury can find nvml.h when CUDA support is built Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-04-05 11:46:22 +09:00
Gilles Gouaillardet	8d7541f766	hwloc: disable nvml if CUDA support is not built in Open MPI Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-04-05 11:07:34 +09:00
Ralph Castain	9132bb26fe	Merge pull request #3281 from rhc54/topic/dmx Adjust the timeout for direct modex requests to reflect the size of t…	2017-04-04 19:04:33 -07:00
Ralph Castain	40ca43e157	Set the PARENT vpid for direct routed module Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-04-04 19:03:28 -07:00
Ralph Castain	734b90aa6b	Adjust the timeout for direct modex requests to reflect the size of the job. It can take several seconds to start all the procs, and we don't want to timeout due to differences in start times of the various procs Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-04-04 18:20:51 -07:00
Ralph Castain	9cb18b8348	Merge pull request #3280 from rhc54/topic/dvm Fix the DVM by ensuring that all nodes, even those that didn't partic…	2017-04-04 18:15:33 -07:00
Ralph Castain	74863a0ea4	Fix the DVM by ensuring that all nodes, even those that didn't participate (i.e., didn't have any local children) in a job, clean up all resources associated with that job upon its completion. With the advent of backend distributed mapping, nodes that weren't part of the job would still allocate resources on other nodes - and then start from that point when mapping the next job. This change ensures that all daemons start from the same point each time. Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-04-04 17:31:38 -07:00
Aurelien Bouteiller	308d33dca7	Merge pull request #3266 from abouteiller/bugfix/f90mpiext Fix the top_ompi_srcdir in Fortran mpiext building system	2017-04-04 16:25:30 -04:00
Nathaniel Graham	7063f3021f	Merge pull request #3231 from nrgraham23/revamp_mpirun_help mpirun --help output revamp	2017-04-04 12:32:20 -06:00
Nathaniel Graham	19e5d15491	mpirun --help output revamp This commit modifies the output from the mpirun --help command. The options have been split into groups, to make the output smaller and more readable. The groups are: general, debug, output, input, mapping, ranking, binding, devel, compatibility, launch, dvm, and unsupported. There is also a special "full" command that can be used to get the old behaviour of printing out all of the options. Unsupported options may only be seen with this full output. This commit also adds a special case for the help argument. It makes it possible for the user to enter 0 or 1 arguments instead of having to always enter an argument. This defaults to printing out the "general" help options so the user can then see what help arguments there are. Signed-off-by: Nathaniel Graham <ngraham@lanl.gov>	2017-04-04 10:59:32 -06:00
Ralph Castain	a605bd4265	Merge pull request #3278 from rhc54/topic/tm Remove stale code line	2017-04-04 09:04:52 -07:00
Ralph Castain	393c4536eb	Remove stale code line Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-04-04 08:13:15 -07:00
Nathan Hjelm	1322e5dee8	Merge pull request #3274 from hjelmn/osc_rdma_fix osc/rdma: fix typo in atomic code	2017-04-04 00:20:42 -06:00
Gilles Gouaillardet	5dfd4ab6ca	coll/tuned: remove set-but-not-used variables Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-04-04 13:18:11 +09:00


26922 commits