/*
 * Copyright (c) 2013-2014 Cisco Systems, Inc. All rights reserved.
 * $COPYRIGHT$
 *
 * Additional copyrights may follow
 *
 * $HEADER$
 */

#include "opal_config.h"

#include <stdio.h>
#include <unistd.h>
/* for exit(), memset()/memcmp()/memcpy(), assert(), struct sockaddr,
   and htonl()/ntohl()/inet_ntop(), all used below */
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <sys/socket.h>
#include <arpa/inet.h>

#include <infiniband/verbs.h>

#include "opal/util/show_help.h"
#include "opal/constants.h"
#include "opal/util/if.h"

#include "btl_usnic_util.h"


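/*
 * Fatal-error exit path for the usnic BTL.  The contract, inferred
 * from the code below: if module is NULL, fall back to the first
 * active module that registered a PML error callback; if a callback
 * is found, invoke it with MCA_BTL_ERROR_FLAGS_FATAL; in all cases,
 * exit(1).  This function never returns.
 */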
void opal_btl_usnic_exit(opal_btl_usnic_module_t *module)
{
    if (NULL == module) {
        /* Find the first module with an error callback */
        for (uint32_t i = 0; i < mca_btl_usnic_component.num_modules; ++i) {
            if (NULL != mca_btl_usnic_component.usnic_active_modules[i]->pml_error_callback) {
                module = mca_btl_usnic_component.usnic_active_modules[i];
                break;
            }
        }

        /* If we didn't find a PML error callback, just exit. */
        if (NULL == module) {
            exit(1);
        }
    }

    /* After discussion with George, we decided that it was safe to
       cast away the const from opal_proc_local_get() -- the error
       function needs to be smart enough to not take certain actions
       if the passed proc is yourself (e.g., don't call del_procs() on
       yourself). */
    if (NULL != module->pml_error_callback) {
        module->pml_error_callback(&module->super,
                                   MCA_BTL_ERROR_FLAGS_FATAL,
                                   (opal_proc_t*) opal_proc_local_get(),
                                   "usnic");
    }

    /* If the PML error callback returns (or if there wasn't one),
       just exit.  Shrug. */
    exit(1);
}


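/*
 * Log a hex dump of the len bytes starting at addr via opal_output():
 * 16 bytes per line, each line prefixed with the offset of its first
 * byte.
 */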
void
opal_btl_usnic_dump_hex(uint8_t *addr, int len)
{
    char buf[128];
    size_t bufspace;
    int i, ret;
    char *p;
    uint32_t sum = 0;

    p = buf;
    memset(buf, 0, sizeof(buf));
    bufspace = sizeof(buf) - 1;

    for (i = 0; i < len; ++i) {
        ret = snprintf(p, bufspace, "%02x ", addr[i]);
        p += ret;
        bufspace -= ret;

        sum += addr[i];
        if ((i & 15) == 15) {
            opal_output(0, "%4x: %s\n", i & ~15, buf);

            p = buf;
            memset(buf, 0, sizeof(buf));
            bufspace = sizeof(buf) - 1;
        }
    }
    if ((i & 15) != 0) {
        opal_output(0, "%4x: %s\n", i & ~15, buf);
    }
    /* opal_output(0, "buffer sum = %x\n", sum); */
}


/*
 * Trivial wrapper around snprintf'ing an IPv4 address, with or
 * without a CIDR mask (we don't usually carry around addresses in
 * struct sockaddr form, so this wrapper is marginally easier than
 * using inet_ntop()).
 */
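/*
 * Illustrative example (values assumed): the network-order address
 * 10.1.2.3 with cidrmask 24 formats as "10.1.2.3/24"; with cidrmask 0
 * the mask is omitted and the result is "10.1.2.3".
 */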
void opal_btl_usnic_snprintf_ipv4_addr(char *out, size_t maxlen,
                                       uint32_t addr, uint32_t cidrmask)
{
    uint8_t *p = (uint8_t*) &addr;
    if (cidrmask > 0) {
        snprintf(out, maxlen, "%u.%u.%u.%u/%u",
                 p[0], p[1], p[2], p[3], cidrmask);
    } else {
        snprintf(out, maxlen, "%u.%u.%u.%u",
                 p[0], p[1], p[2], p[3]);
    }
}


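/*
 * Format a 6-byte MAC address as colon-separated hex.  The formatted
 * string needs 18 bytes including the terminating NUL; the snprintf
 * bound below assumes out points at a buffer of at least 32 bytes.
 */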
void opal_btl_usnic_sprintf_mac(char *out, const uint8_t mac[6])
{
    snprintf(out, 32, "%02x:%02x:%02x:%02x:%02x:%02x",
             mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}


void opal_btl_usnic_sprintf_gid_mac(char *out, union ibv_gid *gid)
{
    uint8_t mac[6];

    opal_btl_usnic_gid_to_mac(gid, mac);
    opal_btl_usnic_sprintf_mac(out, mac);
}


/* Pretty-print the given boolean array as a hexadecimal string.  slen
 * should include space for the null terminator. */
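/* Illustrative example (values assumed): a[] = {1,0,1,0, 1,1,1,1}
 * with alen == 8 packs to the nybbles 0xa and 0xf, so s becomes
 * "af". */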
void opal_btl_usnic_snprintf_bool_array(char *s, size_t slen, bool a[], size_t alen)
{
    size_t i = 0;
    size_t j = 0;

    /* could accommodate other cases, but not needed right now */
    assert(slen % 4 == 0);

    /* compute one nybble at a time */
    while (i < alen && (j < slen - 1)) {
        unsigned char tmp = 0;

        /* first bool is the leftmost (most significant) bit of the nybble */
        tmp |= !!a[i+0] << 3;
        tmp |= !!a[i+1] << 2;
        tmp |= !!a[i+2] << 1;
        tmp |= !!a[i+3] << 0;
        /* map the nybble to a real hex digit; simply adding '0' would
           emit ':'..'?' for values 10-15 */
        s[j] = "0123456789abcdef"[tmp];

        ++j;
        i += 4;
    }

    s[j++] = '\0';
    assert(i <= alen);
    assert(j <= slen);
}


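/*
 * Find the local IP interface whose MAC address matches mac.  On
 * success, fill in the module's if_name, if_ipv4_addr, if_cidrmask,
 * if_mac, and if_mtu fields and the local_addr fields sent in the
 * modex, and return OPAL_SUCCESS; otherwise return OPAL_ERR_NOT_FOUND.
 */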
int opal_btl_usnic_find_ip(opal_btl_usnic_module_t *module, uint8_t mac[6])
{
    int i;
    uint8_t localmac[6];
    char addr_string[32], mac_string[32];
    struct sockaddr sa;
    struct sockaddr_in *sai;

    /* Loop through all IP interfaces looking for the one with the
       right MAC */
    for (i = opal_ifbegin(); i != -1; i = opal_ifnext(i)) {
        if (OPAL_SUCCESS == opal_ifindextomac(i, localmac)) {

            /* Is this the MAC I'm looking for? */
            if (0 != memcmp(mac, localmac, 6)) {
                continue;
            }

            /* Yes, it is! */
            if (OPAL_SUCCESS != opal_ifindextoname(i, module->if_name,
                                                   sizeof(module->if_name)) ||
                OPAL_SUCCESS != opal_ifindextoaddr(i, &sa, sizeof(sa)) ||
                OPAL_SUCCESS != opal_ifindextomask(i, &module->if_cidrmask,
                                                   sizeof(module->if_cidrmask)) ||
                OPAL_SUCCESS != opal_ifindextomac(i, module->if_mac) ||
                OPAL_SUCCESS != opal_ifindextomtu(i, &module->if_mtu)) {
                continue;
            }

            sai = (struct sockaddr_in *) &sa;
            memcpy(&module->if_ipv4_addr, &sai->sin_addr, 4);

            /* Save this information to my local address field on the
               module so that it gets sent in the modex */
            module->local_addr.ipv4_addr = module->if_ipv4_addr;
            module->local_addr.cidrmask = module->if_cidrmask;

            /* Since verbs doesn't offer a way to get standard
               Ethernet MTUs (as of libibverbs 1.1.5, the MTUs are
               enums, and don't include values for 1500 or 9000), look
               up the MTU in the corresponding enic interface. */
            module->local_addr.mtu = module->if_mtu;

            inet_ntop(AF_INET, &(module->if_ipv4_addr),
                      addr_string, sizeof(addr_string));
            opal_btl_usnic_sprintf_mac(mac_string, mac);
            opal_output_verbose(5, USNIC_OUT,
                                "btl:usnic: found usNIC device corresponds to IP device %s, %s/%d, MAC %s",
                                module->if_name, addr_string, module->if_cidrmask,
                                mac_string);
            return OPAL_SUCCESS;
        }
    }

    return OPAL_ERR_NOT_FOUND;
}


/*
 * Reverses the encoding done in usnic_main.c:usnic_mac_to_gid() in
 * the usnic.ko kernel code.
 *
 * Got this scheme from Mellanox RoCE; Emulex did the same thing.  So
 * we followed convention.
 * http://www.mellanox.com/related-docs/prod_software/RoCE_with_Priority_Flow_Control_Application_Guide.pdf
 */
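/*
 * Illustrative example (byte values assumed): GID bytes raw[8..15] =
 * 02:11:22:ff:fe:33:44:55 decode to MAC 00:11:22:33:44:55 -- the
 * ff:fe in raw[11..12] is the EUI-64 filler that gets dropped, and
 * the "^ 2" flips the universal/local bit back.
 */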
void opal_btl_usnic_gid_to_mac(union ibv_gid *gid, uint8_t mac[6])
{
    mac[0] = gid->raw[8] ^ 2;
    mac[1] = gid->raw[9];
    mac[2] = gid->raw[10];
    mac[3] = gid->raw[13];
    mac[4] = gid->raw[14];
    mac[5] = gid->raw[15];
}


/* Takes an IPv4 address in network byte order and a CIDR prefix length
 * (the "X" in "a.b.c.d/X") and returns the subnet in network byte
 * order. */
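/* Worked example (values assumed): for 10.1.2.3 in network order and
 * cidr_len == 24, the host-order mask is 0xffffff00 and the function
 * returns 10.1.2.0 in network order. */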
uint32_t opal_btl_usnic_get_ipv4_subnet(uint32_t addrn, uint32_t cidr_len)
{
    uint32_t mask;

    assert(cidr_len <= 32);

    /* perform arithmetic in host byte order for shift correctness;
       guard cidr_len == 0 explicitly, since shifting a 32-bit value
       by 32 is undefined behavior */
    mask = (0 == cidr_len) ? 0 : (~(uint32_t) 0) << (32 - cidr_len);
    return htonl(ntohl(addrn) & mask);
}


/*
 * Simple utility in a .c file, mainly so that inline functions in .h
 * files don't need to include RTE header files.
 */
void opal_btl_usnic_util_abort(const char *msg, const char *file, int line)
{
    opal_show_help("help-mpi-btl-usnic.txt", "internal error after init",
                   true,
                   opal_process_info.nodename,
                   msg, file, line);

    opal_btl_usnic_exit(NULL);
    /* Never returns */
}


/* Return the largest data size that can be packed into max_len using
 * the given convertor.  For example, a 1000-byte max_len buffer may
 * only be able to hold 998 bytes if an indivisible convertor element
 * straddles the 1000-byte boundary.
 *
 * This routine internally clones the convertor and does not mutate it!
 */
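/* Typical (hypothetical) use: ask how many bytes fit in a segment,
 * then pack exactly that many so no datatype element is split:
 *
 *     size_t n = opal_btl_usnic_convertor_pack_peek(conv, seg_space);
 *
 * where seg_space is the free space remaining in the segment. */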
size_t opal_btl_usnic_convertor_pack_peek(const opal_convertor_t *conv,
                                          size_t max_len)
{
    int rc;
    size_t packable_len, position;
    opal_convertor_t temp;

    OBJ_CONSTRUCT(&temp, opal_convertor_t);
    position = conv->bConverted + max_len;
    rc = opal_convertor_clone_with_position(conv, &temp, 1, &position);
    if (OPAL_UNLIKELY(rc < 0)) {
        BTL_ERROR(("unexpected convertor error"));
        abort(); /* XXX */
    }
    assert(position >= conv->bConverted);
    packable_len = position - conv->bConverted;
    OBJ_DESTRUCT(&temp);
    return packable_len;
}