Very slow CMake O2 installation on macOS

At the moment, the installation process on macOS for O2 is painfully slow: a simple make install, or ninja install, issued in the build directory when nothing has changed in the source code takes around 6 minutes, whereas on my Linux node it takes three seconds.

Apparently, the install target, when generated by CMake, cannot run in parallel (so that -j whatever has no effect), and quite some time is spent on relocating the binaries using install_name_tool -delete_rpath ...: the net result is that the binaries (libraries, executables) in the build directory always differ from the ones in the installation directory, and CMake thinks it’s a good idea to re-relocate and re-copy them every bloody time.

It was (quite correctly, IMHO) pointed out that we have quite a number of binaries and it has to be reduced, but this issue still stands and it has an impact on macOS development.

I’d like it to be fixed possibly without writing a custom install() command…

I am wondering if some CMake guru like @deklein has some suggestion here :wink:

I have looked into the issue and believe I have found two related bugs, which I have filed upstream: https://gitlab.kitware.com/cmake/cmake/issues/17995. I have no good idea for a workaround yet. Maybe we will get an answer soon.

~Dennis

Thanks for the detailed report! Actually there’s an answer… my issue is that CMake deletes the rpath, and someone there says he made his installation faster by using the same rpath for build and install. We might want to have a look into that.

I checked the suggestion by Gregor and he is right. You can work around the installation bug by adding -DCMAKE_BUILD_WITH_INSTALL_RPATH=TRUE, in the o2.sh recipe for example. But this will render the executables unusable from the build tree unless you explicitly set your DYLD_LIBRARY_PATH. It is possible to work around even that by setting a relative RPATH:

On macOS: set(CMAKE_INSTALL_RPATH "@loader_path/../lib")

On Linux (just FYI: if you use RPATH on Linux, you should prefer RUNPATH, which is enabled as follows):

set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,--enable-new-dtags")
set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -Wl,--enable-new-dtags")
set(CMAKE_INSTALL_RPATH "$ORIGIN/../lib")

If you match the RUNTIME/LIBRARY DESTINATIONS in your build tree with your install tree, then the relative RPATH is even correct for both trees.
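
To illustrate what "matching" means here, a minimal sketch. The project name, target names and source files (demo, mylib, myexe, mylib.cxx, main.cxx) are made up for the example, and bin/ and lib/ are only assumed destinations; the real O2 layout may differ.

```cmake
cmake_minimum_required(VERSION 3.5)
project(demo CXX)  # hypothetical project, for illustration only

# Place build products in <build>/bin and <build>/lib, i.e. the same
# relative layout that install() uses below.
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/bin")
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/lib")

add_library(mylib SHARED mylib.cxx)      # hypothetical source file
add_executable(myexe main.cxx)           # hypothetical source file
target_link_libraries(myexe PRIVATE mylib)

# Same bin/ and lib/ split in the install tree, so a relative RPATH such as
# ../lib resolves identically from <build>/bin and from <prefix>/bin.
install(TARGETS mylib myexe
        RUNTIME DESTINATION bin
        LIBRARY DESTINATION lib)
```

With this layout, an executable in bin/ finds its libraries one directory up in lib/, whichever tree it is started from.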

The point is that we do use executables from the build path (with ctest for
instance). I’ll have a look at the best way to implement this. Thanks for
following this up!

I forgot to mention that this is already the case for the O2 repo.

I recommend putting this code somewhere in your CMake files (replace ${PROJECT_INSTALL_LIBDIR} with lib or whatever variable you are currently using, and remove the Linux case if you don’t want it), and adding set(CMAKE_BUILD_WITH_INSTALL_RPATH TRUE). This worked for me in a quick test with O2 on macOS 10.11.
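
Put together, the settings discussed in this thread could look roughly like the following. This is only a sketch, assuming libraries are installed in lib/ next to bin/. It has to come before the targets are created, since CMAKE_INSTALL_RPATH and CMAKE_BUILD_WITH_INSTALL_RPATH only initialize the properties of targets defined afterwards.

```cmake
# Link with the install RPATH already in the build tree, so that the
# install step has no RPATH left to rewrite.
set(CMAKE_BUILD_WITH_INSTALL_RPATH TRUE)

if(APPLE)
  # Resolved relative to the binary that does the loading.
  set(CMAKE_INSTALL_RPATH "@loader_path/../lib")
elseif(CMAKE_SYSTEM_NAME STREQUAL "Linux")
  # Emit RUNPATH instead of RPATH. The flags are appended inside the
  # string so the variable does not turn into a ;-separated list.
  string(APPEND CMAKE_EXE_LINKER_FLAGS " -Wl,--enable-new-dtags")
  string(APPEND CMAKE_SHARED_LINKER_FLAGS " -Wl,--enable-new-dtags")
  # $ORIGIN is passed through literally; the dynamic loader expands it.
  set(CMAKE_INSTALL_RPATH "$ORIGIN/../lib")
endif()
```

The lib component is the assumption to adapt: if the libraries go elsewhere, the relative path has to follow.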

Ciao,

True, but don’t we always say that if you want to run anything you need to set up the environment with alienv?

I think we should simply forget about RPATH as it does not match our use case of “build once, install everywhere”.

@eulisse but the developer use case is “build frequently, install rarely”, isn’t it?

How do you invoke ctest?

I personally use alienv setenv O2/latest -c ctest or a separate tmux pane where I did alienv enter O2/latest. Again, our environment is in any case more than just LD_LIBRARY_PATH (think for example of the G4 environment variables), so you do need to use alienv in any case.

RPATH is evil.

I rather conclude LD_LIBRARY_PATH is evil :wink: (Isn’t LD_LIBRARY_PATH even disabled by default on recent MacOS?)

die LD_LIBRARY_PATH!
die RPATH!

gimme good ol’ completely static binaries that are easily deployable :slight_smile:

btw, I also noticed that e.g. ninja install is awfully slow if done under a full environment (i.e. after module load O2/latest…), while it is reasonable (but still slow) if done without any module loaded. Has anyone else observed this?

So apparently FairMQ 1.2.2 is actually quite an improvement in build speed. When is the new split package layout supposed to be ported to O2?

Ok, I did a workaround in the framework code which seems to improve the situation for me. Can you check?

?

Did a shake -j8 --profile on your PR (top part of image below) vs dev (bottom part).

Not a drastic change but an (unparalleled) gain of 3 minutes still.

I’ve posted the two complete reports at: https://cernbox.cern.ch/index.php/s/unP84Q06pY7uVmA (build of PR#1092) and https://cernbox.cern.ch/index.php/s/jQRqRSAa4AD8q3B (ref build)

Great, thanks for confirming. I will merge this then. There is more work that can be done, at least in the framework, to hide the rest of FairRoot. Hiding ROOT as well would probably help. I suspect we need similar campaigns elsewhere, and I would be all in favour of forbidding Boost in the APIs.

Could you also try out https://github.com/AliceO2Group/AliceO2/pull/1094?

Hum, barring an error on my side (I tried the build twice though), it actually seems a bit slower, or let’s say not faster at least :wink:

https://cernbox.cern.ch/index.php/s/MVRfStBlUF3cifP

I suspect that’s in the profiling noise. Probably the changes are not enough to hide more stuff, so basically the PR is a no-op. Indeed the deps files seem to be unchanged, so I suspect there is something else bringing in the stuff. Will need a bit more massaging I guess.