aliBuild O2 fails on Ubuntu 18.04 after python3 / ofi updated in alidist


(Ruben Shahoyan) #1

Hi

I get multiple problems with aliBuild build O2 --defaults o2 on Ubuntu 18.04 after the recent changes in alidist (the new dependencies on ofi and python3).

  1. The aliDoctor O2 --defaults o2 was showing:
...
ERROR: brew() { true; }; pkg-config --atleast-version=1.6.0 libfabric 2>&1 && printf "#include \"rdma/fabric.h\"\nint main(){}" | gcc -xc - -o /dev/null
ERROR: with the following output:
ERROR: 
ERROR: ofi: 
ERROR: libfabric and its development package are missing from your system.
ERROR:  * RHEL-compatible systems: you will probably need "libfabric" and "libfabric-devel" packages.
ERROR:  * Ubuntu-compatible systems: you will probably need "libfabric-bin" and "libfabric-dev".
ERROR: 

I have libfabric-dev installed, but the highest version Ubuntu 18.04 provides is 1.5.3-1 (1.6 comes with Ubuntu 18.10).

I had to change the version from 1.6.0 to 1.5.0 in ofi.sh to get past this. Is it really critical to have libfabric >= 1.6.0?
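
For anyone who wants to check their own system before running aliDoctor, here is a minimal sketch of the same kind of version test. It assumes GNU `sort -V` is available; `version_at_least` is a hypothetical helper name, not something from alidist.

```shell
#!/bin/sh
# Sketch: compare an installed version with a required minimum.
# Relies on GNU `sort -V` (version sort). version_at_least is a hypothetical helper.
version_at_least() {
  installed=$1
  required=$2
  # The smaller of the two versions sorts first; if that is the required one,
  # the installed version is new enough.
  lowest=$(printf '%s\n%s\n' "$installed" "$required" | sort -V | head -n 1)
  [ "$lowest" = "$required" ]
}

# Query the system libfabric only if pkg-config can see it.
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists libfabric; then
  found=$(pkg-config --modversion libfabric)
  if version_at_least "$found" 1.6.0; then
    echo "libfabric $found satisfies >= 1.6.0"
  else
    echo "libfabric $found is older than 1.6.0"
  fi
else
  echo "libfabric not visible to pkg-config"
fi
```

On a stock Ubuntu 18.04 with libfabric-dev 1.5.3-1 this should take the "older than 1.6.0" branch.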

  2. ==> Building Python-modules@1.0 fails due to:
...
+ env PYTHONUSERBASE=/home/shahoian/alice/sw/INSTALLROOT/8e1ed7866cc8b74e275a2371b177d526f12d1abc/ubuntu1804_x86-64/Python-modules/1.0-1 pip3 install --user -IU --no-warn-script-location -r requirements.txt

Usage:   
  pip install [options] <requirement specifier> [package-index-options] ...
  pip install [options] -r <requirements file> [package-index-options] ...
  pip install [options] [-e] <vcs project url> ...
  pip install [options] [-e] <local project path> ...
  pip install [options] <archive url/path> ...

no such option: --no-warn-script-location
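
A possible way to make a recipe tolerant of older pip versions is to probe the help text before passing the option. This is only a sketch: `pip_supports` and `PIP_EXTRA` are hypothetical names and are not part of python-modules.sh.

```shell
#!/bin/sh
# Sketch: pass an option to pip only if its help text advertises it.
# pip_supports is a hypothetical helper; it reads the help text on stdin.
pip_supports() {
  # -F: fixed string, -e: lets the pattern start with "--"
  grep -qF -e "$1"
}

PIP_EXTRA=""
if command -v pip3 >/dev/null 2>&1 &&
   pip3 install --help 2>/dev/null | pip_supports --no-warn-script-location; then
  PIP_EXTRA="--no-warn-script-location"
fi
echo "extra pip options: ${PIP_EXTRA:-<none>}"

# The install line would then become something like:
#   pip3 install --user -IU $PIP_EXTRA -r requirements.txt
```

Probing the help text avoids hard-coding the pip version in which the option was introduced.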

Now, after removing the --no-warn-script-location option from python-modules.sh, it fails in Python-modules with

FATAL: problems importing the following Python modules
* seaborn
* sklearn_evaluation

I’ve uploaded the Python-modules-latest_log and the output of aliDoctor O2 --defaults o2 to CERNBox.

Cheers,
Ruben


(Giulio Eulisse) #2

I think you are missing the python3-tk package on your ubuntu.
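
A quick way to confirm this kind of problem is to try the imports directly. The sketch below uses a hypothetical helper, `missing_modules`, which prints every module that fails to import. If python3 itself is not on the PATH, all modules are reported as missing.

```shell
#!/bin/sh
# Sketch: report which Python modules fail to import.
# missing_modules is a hypothetical helper; it prints one failing module per line.
missing_modules() {
  for mod in "$@"; do
    python3 -c "import $mod" >/dev/null 2>&1 || echo "$mod"
  done
}

# tkinter is provided by the python3-tk package on Ubuntu; the other two are
# the modules named in the FATAL message above.
missing_modules tkinter seaborn sklearn_evaluation
```

If tkinter shows up in the output, installing python3-tk is the first thing to try.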


(Ruben Shahoyan) #3

Thanks, yes, I’ve already figured this out (why is it not set as a dependency?) and the Python-modules stage now passes. But now the asiofi build fails because of the libfabric version: it needs 1.6

shahoian@alicers02:~/alice/sw/SOURCES/asiofi/v0.3.1/v0.3.1$ git grep '1.6.0'
CMakeLists.txt:  find_package2(PUBLIC OFI VERSION 1.6.0 REQUIRED

so my hack in ofi.sh did not work… I am now trying to compile libfabric 1.7 from source.


(Giulio Eulisse) #4

I would simply drop asiofi from fairmq. As far as I know we do not use it for anything but data distribution. We should probably ask WP5 and WP11 to split the dependency from ALFA and move it to datadistribution itself…


(Ruben Shahoyan) #5

OK, but doing this only at the level of the fairmq.sh recipe does not work:

CMake Error at cmake/FairMQLib.cmake:316 (find_package):
  By not providing "Findasiofi.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "asiofi", but
  CMake did not find one.
...

@deklein, any suggestion?


(Giulio Eulisse) #6

Try with: https://github.com/alisw/alidist/pull/1597


(Dennis Klein) #7

@shahoian Sorry, I somehow missed the notification and didn’t see this ticket.

This is the intended behaviour (the CMake error message is admittedly not the nicest, but we rely on standard CMake facilities here). All FairMQ build switches of the form -DBUILD_<SOMETHING> are simple and fail fast if their requirements are not met. The solution here is what @eulisse proposed: “do not set this flag if you do not want this feature”.

I partly answered this in https://github.com/alisw/alidist/pull/1603. To sum up, it seems to me that supporting ofi 1.5.x should be straightforward; I have opened an issue with asiofi (https://github.com/FairRootGroup/asiofi/issues/4) and will test it at the next occasion.

In parallel, I recommend building the ofi transport only in environments where you really need it.
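
As a rough sketch of what "only set the flag when you want the feature" could look like in a recipe: the snippet below assumes the relevant switch is called BUILD_OFI_TRANSPORT and that an ASIOFI_ROOT variable is set when asiofi was built. Both names are assumptions and should be checked against the actual fairmq.sh; `ofi_cmake_flags` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: enable the OFI transport only when an asiofi installation is available.
# ASSUMPTIONS: BUILD_OFI_TRANSPORT is the FairMQ switch in question and
# ASIOFI_ROOT points at the asiofi installation; verify both in your recipe.
ofi_cmake_flags() {
  if [ -n "${ASIOFI_ROOT:-}" ]; then
    echo "-DBUILD_OFI_TRANSPORT=ON"
  fi
}

echo "cmake flags for OFI: $(ofi_cmake_flags)"

# In a recipe the result would be spliced into the cmake call, e.g.:
#   cmake "$SOURCEDIR" $(ofi_cmake_flags) ...
```

With ASIOFI_ROOT unset, no flag is emitted, so CMake never looks for asiofi and the fail-fast error is not triggered.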


(Ruben Shahoyan) #8

@deklein
It seems the problem is solved by having alidist build the latest ofi version, so you probably don’t need to go back to ofi 1.5. As for the decision whether to build ofi or not, I guess this should be handled at the defaults level (I was using --defaults o2).

Cheers,
Ruben