5b8616d890
was quite subtle, and only happened on the process with the smallest guid (as this process tears down the connection it created locally and replaces it with the result of accept). If multiple threads are active in the system, the deadlock occurs during the recv event deletion: one thread holds the recv event lock of the endpoint and tries to acquire the TCP event base lock, while the other thread holds the TCP event base lock and tries to acquire the recv event lock (in case data is available on the socket). The proposed solution lets the event callback fail to process the data, preventing the deadlock and allowing the other thread to always complete its job. As the event is not executed, the same trigger will fire again at the next opportunity, so this solution introduces a minimal delay in connection establishment. |
||
---|---|---|
btl_tcp_addr.h | ||
btl_tcp_component.c | ||
btl_tcp_endpoint.c | ||
btl_tcp_endpoint.h | ||
btl_tcp_frag.c | ||
btl_tcp_frag.h | ||
btl_tcp_ft.c | ||
btl_tcp_ft.h | ||
btl_tcp_hdr.h | ||
btl_tcp_proc.c | ||
btl_tcp_proc.h | ||
btl_tcp.c | ||
btl_tcp.h | ||
configure.m4 | ||
help-mpi-btl-tcp.txt | ||
Makefile.am |
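
The lock-order inversion described in the commit message above, and the "let the callback fail" remedy, can be illustrated with a minimal pthreads sketch. This is not the actual Open MPI code: `event_base_lock`, `endpoint_recv_lock`, `recv_event_callback`, `event_loop_once` and the byte counters are hypothetical stand-ins chosen for illustration. The only point it shows is that the callback, which already runs under the event base lock, uses a try-lock on the endpoint lock and backs off instead of blocking.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two locks named in the commit message. */
static pthread_mutex_t event_base_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t endpoint_recv_lock = PTHREAD_MUTEX_INITIALIZER;

static int pending_bytes = 0;    /* data waiting on the (simulated) socket */
static int processed_bytes = 0;  /* data consumed by the callback so far */

/*
 * Invoked by the event loop with event_base_lock already held.
 * Blocking on endpoint_recv_lock here could deadlock against a thread
 * that holds endpoint_recv_lock and is waiting for event_base_lock
 * (for example while deleting the recv event), so we only try the lock.
 * Returns true if data was processed, false if the work was deferred.
 */
static bool recv_event_callback(void)
{
    int rc = pthread_mutex_trylock(&endpoint_recv_lock);
    if (rc != 0) {
        assert(rc == EBUSY);  /* only failure expected for a valid mutex */
        return false;         /* leave the data; the event fires again later */
    }
    processed_bytes += pending_bytes;
    pending_bytes = 0;
    pthread_mutex_unlock(&endpoint_recv_lock);
    return true;
}

/* One pass of a simulated event loop: dispatch the callback if data is pending. */
static bool event_loop_once(void)
{
    bool handled = false;
    pthread_mutex_lock(&event_base_lock);
    if (pending_bytes > 0) {
        handled = recv_event_callback();
    }
    pthread_mutex_unlock(&event_base_lock);
    return handled;
}
```

If another thread holds `endpoint_recv_lock` when `event_loop_once()` runs, the call returns `false` and `pending_bytes` is left untouched. A later pass, after that lock has been released, picks up the same data, which corresponds to the small delay the commit message mentions.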