Roc-bench-dma crash

Dear FLP support,
can you help with the following:
[flp@cru-server2 aj]$ roc-bench-dma --i=#0 --data=Fee --fast --bypass --to-file-bin=/tmp/data --bytes=200Mi

Error: Couldn’t lock Memory Mapped File; /jenkins/workspace/BuildRPM/sw/20156213/1/SOURCES/ReadoutCard/v0.28.0/v0.28.0/include/ReadoutCard/InterprocessLock.h(72): Throw in function AliceO2::roc::Interprocess::lock::Lock(const string&, bool)
Dynamic exception type: boost::wrapexcept<std::runtime_error>
std::exception::what: Couldn’t bind to socket Alice_O2_RoC_MMF_/var/lib/hugetlbfs/global/pagesize-1GB/roc-bench-dma_id=#0_chan=0_pages_lock

Cheers,

Hello Anton, most of our experts are on holidays this week…

The message Couldn’t bind to socket could mean that the socket is already bound, meaning some process is already listening to it.
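To illustrate how this kind of failure can arise, here is a minimal Python sketch of a socket-based lock. It assumes the lock is a Linux abstract-namespace Unix socket, which nothing in this thread confirms. The function `try_lock` and the lock name are hypothetical and are not part of the ReadoutCard API.

```python
import errno
import socket


def try_lock(name):
    """Try to take a lock by binding an abstract Unix socket.

    Returns the bound socket on success, or None if the name is already bound.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # A leading NUL byte selects the Linux abstract namespace,
        # so no file is created on disk.
        sock.bind(b"\0" + name.encode())
    except OSError as exc:
        sock.close()
        if exc.errno == errno.EADDRINUSE:
            return None  # someone else already holds the name
        raise
    return sock


if __name__ == "__main__":
    first = try_lock("demo_lock_name")   # hypothetical lock name
    second = try_lock("demo_lock_name")  # fails while `first` is open
    print("first bound:", first is not None)
    print("second bound:", second is not None)
```

Under that assumption, the name is freed when the holder closes the socket or exits, so a second bind only fails while some process still has the socket open.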

Did you try killing any leftover processes? Or, as a last resort, rebooting the machine?

You could also try lsof and search for the string Alice_O2_RoC_MMF, which would point to the “guilty” process (or possibly netstat -a, though there I would not know exactly what to search for).
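As an alternative to lsof, the same search can be sketched in Python by reading /proc/net/unix directly. This assumes the lock shows up there as a Unix socket (abstract names are listed with a leading "@"). The helper names `find_unix_sockets` and `pids_holding_inode` are hypothetical. Without root, only the current user's processes can be inspected, so an empty result is not conclusive.

```python
import os


def find_unix_sockets(needle, table_text):
    """Return (inode, path) pairs from /proc/net/unix-style text
    whose path contains `needle`."""
    matches = []
    for line in table_text.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 7)
        # Unnamed sockets have only 7 columns (no Path column).
        if len(fields) == 8 and needle in fields[7]:
            matches.append((int(fields[6]), fields[7]))
    return matches


def pids_holding_inode(inode):
    """Return PIDs that have an open descriptor on the given socket inode."""
    target = "socket:[%d]" % inode
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        fd_dir = os.path.join("/proc", entry, "fd")
        try:
            for fd in os.listdir(fd_dir):
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    pids.append(int(entry))
                    break
        except OSError:
            continue  # process exited, or we lack permission to look
    return pids
```

To use it, pass the contents of /proc/net/unix and the string Alice_O2_RoC_MMF to `find_unix_sockets`, then call `pids_holding_inode` on each inode returned.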

Thank you.
lsof did not find Alice_O2_RoC_MMF. After a reboot I see the same:
[flp@cru-server2 ~]$ aj/ctpro.sh dumpf
2021-04-07 15:06:19.046647 DMA channel: 0
2021-04-07 15:06:19.047917 IOMMU enabled
Error: Couldn’t lock Memory Mapped File; /jenkins/workspace/BuildRPM/sw/20156213/1/SOURCES/ReadoutCard/v0.28.0/v0.28.0/include/ReadoutCard/InterprocessLock.h(72): Throw in function AliceO2::roc::Interprocess::lock::Lock(const string&, bool)
Dynamic exception type: boost::wrapexcept<std::runtime_error>
std::exception::what: Couldn’t bind to socket Alice_O2_RoC_MMF_/var/lib/hugetlbfs/global/pagesize-1GB/roc-bench-dma_id=#0_chan=0_pages_lock

[Error message] = Couldn’t lock Memory Mapped File; /jenkins/workspace/BuildRPM/sw/20156213/1/SOURCES/ReadoutCard/v0.28.0/v0.28.0/include/ReadoutCard/InterprocessLock.h(72): Throw in function AliceO2::roc::Interprocess::lock::Lock(const string&, bool)
Dynamic exception type: boost::wrapexcept<std::runtime_error>
std::exception::what: Couldn’t bind to socket Alice_O2_RoC_MMF_/var/lib/hugetlbfs/global/pagesize-1GB/roc-bench-dma_id=#0_chan=0_pages_lock

Cheers, Anton

That’s not good…

Could you please run this command:

cat /etc/o2.d/FLP_suite_version

[flp@cru-server2 ~]$ cat /etc/o2.d/FLP_suite_version
0.14.0

That’s a pretty old version. Can you please update to 0.16.0?

That can be done when Pippo is back; I do not have the admin password. Thank you, Cheers.

Aha, OK. I can do the update if you want…

I had a phone call with Anton. As the machine is pretty old, reportedly has problems on reboot, and neither CRU expert is available, I prefer to hold off on the (possible) update until next week, when the experts will be back.

Hello,

Indeed, the reported error means that the socket lock is held by another process.

I just tried running bench-dma and it went through, so another process was running or some leftover resources were not cleaned up.

I agree with Roberto that the server should be updated anyway.

Cheers,
Kostas