The math for checking the number of QPs and CQs per usNIC/VF was
incorrect, allowing you to run MPI processes even when usNICs (i.e.,
VIC VFs) had fewer QPs and CQs than were necessary.  This led to a
confusing error later when fi_enable(3) failed (because we lazily
create QPs).  Fixing the math here ensures that we actually print a
helpful error message telling the user specifically what is wrong.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
This commit is contained in:
Jeff Squyres 2016-04-22 15:55:32 -07:00
parent b213c58e71
commit dc18c32437
2 changed files with 11 additions and 13 deletions


@@ -337,11 +337,11 @@ static int check_usnic_config(opal_btl_usnic_module_t *module,
        1. num_vfs (i.e., "usNICs") >= num_local_procs (to ensure that
           each MPI process will be able to have its own protection
           domain), and
-       2. num_vfs * num_qps_per_vf >= num_local_procs * NUM_CHANNELS
+       2. num_qps_per_vf >= NUM_CHANNELS
           (to ensure that each MPI process will be able to get the
           number of QPs it needs -- we know that every VF will have
           the same number of QPs), and
-       3. num_vfs * num_cqs_per_vf >= num_local_procs * NUM_CHANNELS
+       3. num_cqs_per_vf >= NUM_CHANNELS
           (to ensure that each MPI process will be able to get the
           number of CQs that it needs) */
     if (uip->ui.v1.ui_num_vf < unlp) {
@@ -350,19 +350,17 @@ static int check_usnic_config(opal_btl_usnic_module_t *module,
         goto error;
     }
-    if (uip->ui.v1.ui_num_vf * uip->ui.v1.ui_qp_per_vf <
-        unlp * USNIC_NUM_CHANNELS) {
-        snprintf(str, sizeof(str), "Not enough WQ/RQ (found %d, need %d)",
-                 uip->ui.v1.ui_num_vf * uip->ui.v1.ui_qp_per_vf,
-                 unlp * USNIC_NUM_CHANNELS);
+    if (uip->ui.v1.ui_qp_per_vf < USNIC_NUM_CHANNELS) {
+        snprintf(str, sizeof(str), "Not enough transmit/receive queues per usNIC (found %d, need %d)",
+                 uip->ui.v1.ui_qp_per_vf,
+                 USNIC_NUM_CHANNELS);
         goto error;
     }
-    if (uip->ui.v1.ui_num_vf * uip->ui.v1.ui_cq_per_vf <
-        unlp * USNIC_NUM_CHANNELS) {
+    if (uip->ui.v1.ui_cq_per_vf < USNIC_NUM_CHANNELS) {
         snprintf(str, sizeof(str),
-                 "Not enough CQ per usNIC (found %d, need %d)",
-                 uip->ui.v1.ui_num_vf * uip->ui.v1.ui_cq_per_vf,
-                 unlp * USNIC_NUM_CHANNELS);
+                 "Not enough completion queues per usNIC (found %d, need %d)",
+                 uip->ui.v1.ui_cq_per_vf,
+                 USNIC_NUM_CHANNELS);
         goto error;
     }
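
To make the difference between the two checks concrete, here is a minimal standalone sketch. It is not code from the usnic BTL: the function names `old_queue_check`, `new_queue_check`, and `config_ok` are hypothetical, and `NUM_CHANNELS` is assumed to be 2 only because the old help text said "at least two each". The old check compares totals summed over all VFs, so a node with many VFs that each have too few queues can still pass; the new check looks at a single VF, which is what each MPI process actually gets.

```c
#include <stdbool.h>

/* Assumed value, taken from the old help text ("at least two each");
   the real constant in the BTL is USNIC_NUM_CHANNELS. */
#define NUM_CHANNELS 2

/* Old (incorrect) check: compares aggregate queue counts across all
   VFs against the aggregate need of all local processes. */
static bool old_queue_check(int num_vfs, int queues_per_vf,
                            int num_local_procs)
{
    return num_vfs * queues_per_vf >= num_local_procs * NUM_CHANNELS;
}

/* New check: each process uses one VF, so that single VF must have
   at least NUM_CHANNELS queues of the given kind. */
static bool new_queue_check(int queues_per_vf)
{
    return queues_per_vf >= NUM_CHANNELS;
}

/* All three conditions from the comment in the diff, combined. */
static bool config_ok(int num_vfs, int qps_per_vf, int cqs_per_vf,
                      int num_local_procs)
{
    return num_vfs >= num_local_procs &&
           new_queue_check(qps_per_vf) &&
           new_queue_check(cqs_per_vf);
}
```

For example, with 16 VFs, 1 QP per VF, and 4 local processes, the old check sees 16 >= 8 and passes, even though no single VF can supply the 2 QPs one process needs; the new check sees 1 < 2 and rejects the configuration up front.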


@@ -18,7 +18,7 @@ This means that you have either not provisioned enough usNICs on this
 VIC, or there are not enough total receive, transmit, or completion
 queues on the provisioned usNICs. On each VIC in a given server, you
 need to provision at least as many usNICs as MPI processes on that
-server. In each usNIC, you need to provision at least two each of the
+server. In each usNIC, you need to provision enough of each of the
 following: send queues, receive queues, and completion queues.
 Open MPI will skip this usNIC interface in the usnic BTL, which may