Ninja anyone?

An alidist PR is indeed in the works along those lines :wink:
Will try to post it today.


PR done https://github.com/alisw/alidist/pull/1136. Let’s see how it goes :wink:

I did some timings while putting the PR in place, see
https://www.aphecetche.xyz/2018/05/09/ninja-vs-make/

To be really thorough, one could check that all build artifacts (bins, libs) are actually identical between the make and ninja cases. I might actually do that next, as it could be useful if we want to evaluate more build systems at some point?
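For instance, something along these lines could do a quick comparison (just a sketch; build-make and build-ninja are placeholder names for the two build directories):

# checksum the binaries and libraries of both builds and diff the lists
(cd build-make && find bin lib -type f -exec shasum {} \; | sort -k2) > /tmp/make.sums
(cd build-ninja && find bin lib -type f -exec shasum {} \; | sort -k2) > /tmp/ninja.sums
diff /tmp/make.sums /tmp/ninja.sums

Of course, strictly bit-identical binaries are not guaranteed even for equivalent builds (timestamps and build paths can end up embedded), so some differences might be expected.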

Do you understand why system time is so high in both cases?

@eulisse : no

Actually, finding out where the time is spent in our build might be of some interest.

Don’t know how to do it with ninja proper, but apparently shake is able to consume ninja files and has a very nice, query-able build profiling output (report.html).

@costing could be interested in this.

Is it easy to integrate it into our builds? It would be nice to have such profiles produced automatically. We do that already with coverage.

Can we get the report somewhere? Does it include compilation time only or also the overhead? I’ve the impression that we have something stupid (e.g. starting bash with interactive configuration) which dominates over compile time.

That was quite precisely my request, see above. I’d like to have a badge with it as you have done with Coverage.

On build time, well, I have done some research and opened a different topic earlier:

@eulisse : the report I was referring to should now be available at
https://cernbox.cern.ch/index.php/s/TArbTGYGg5wqNbt

As far as I can tell, the report covers everything. In the report, if you go to the “command plot” dropdown menu you’ll see 3 main areas: c++ compilation (green), cd (red, which is not really cd but dictionary / rootmap generation AFAIK), and “:” (light blue), which I’ve not yet figured out ;-). Using the “command table” you get the numbers for each command (the numbers are the summed, non-parallelized times if I get it correctly, e.g. the total build did not take 35 minutes of wall clock!)

c++ 459 × 35m24s 90.17%
cd 47 × 3m01s 7.67%
: 183 × 36.61s 1.55%
cmake 1 × 8.37s 0.36%
cc 1 × 0.13s 0.01%

@dberzano : I’m no expert on shake itself (I just installed it last week), but basically, once shake is installed (this requires the Haskell platform…) and cmake has generated the ninja files, it’s just a matter of doing shake -jN --profile. The -j is important, as shake uses 1 core by default whereas ninja, for instance, uses everything available by default. So I’d say it should be easy to integrate, yes.
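In other words, something like this (a sketch; the paths and the -j value are just examples):

# generate ninja files with cmake, then let shake drive them and write report.html
cd /path/to/build
cmake -G Ninja /path/to/O2
shake -j8 --profile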

OK if you don’t mind I’ll try doing it for the next WP3 as a demonstrator!

oh, and to be clear, what I timed was just the build (i.e. no install)

Thanks… Looks like the report only shows the first command of a rule, and the 8% spent in cd is really spent on the rootmap rule (which probably starts by cd-ing into some temporary dir).

Can you run once more with:

--profile=report.trace

?

output of:

shake --no-build --profile=report.trace
available at:
CERNBox

Concerning the cd commands, yes, those are most probably the rootmaps (and a pcm copy afterwards as well?)

e.g.

❯ ninja -t commands libCCDB.dylib | head -1
cd /Users/laurent/tmp/build/O2-shake/CCDB && 
DYLD_LIBRARY_PATH=/Users/laurent/alice/sw/osx_x86-64/ROOT/latest-clion-o2/lib: ROOTSYS=/Users/laurent/alice/sw/osx_x86-64/ROOT/latest-clion-o2 
/Users/laurent/alice/sw/osx_x86-64/ROOT/latest-clion-o2/bin/rootcint -f /Users/laurent/tmp/build/O2-shake/CCDB/G__CCDBDict.cxx 
-inlineInputHeader -rmf /Users/laurent/tmp/build/O2-shake/lib/libCCDB.rootmap 
-rml CCDB.dylib -c -I/Users/laurent/alice/sw/osx_x86-64/FairRoot/latest-clion-o2/include -I/Users/laurent/alice/sw/osx_x86-64/FairRoot/latest-clion-o2/include/fairmq -I/Users/laurent/alice/sw/osx_x86-64/ROOT/latest-clion-o2/include -I/Users/laurent/alice/o2-dev/O2/CCDB -I/Users/laurent/alice/o2-dev/O2/CCDB/include -I/Users/laurent/alice/o2-dev/O2/CCDB/src -I/Users/laurent/alice/sw/osx_x86-64/boost/latest-clion-o2/include -I/Users/laurent/alice/sw/osx_x86-64/protobuf/latest-clion-o2/include include/CCDB/Backend.h include/CCDB/BackendOCDB.h include/CCDB/BackendRiak.h include/CCDB/Condition.h include/CCDB/ConditionId.h include/CCDB/ConditionMetaData.h include/CCDB/FileStorage.h include/CCDB/GridStorage.h include/CCDB/IdPath.h include/CCDB/IdRunRange.h include/CCDB/LocalStorage.h include/CCDB/Manager.h include/CCDB/ObjectHandler.h include/CCDB/Storage.h include/CCDB/XmlHandler.h test/TestClass.h /Users/laurent/alice/o2-dev/O2/CCDB/src/CCDBLinkDef.h && 
/usr/local/Cellar/cmake/3.11.1/bin/cmake -E copy_if_different /Users/laurent/tmp/build/O2-shake/CCDB/G__CCDBDict_rdict.pcm /Users/laurent/tmp/build/O2-shake/lib/G__CCDBDict_rdict.pcm

For the record, I did some further investigation and it looks like a dependency on boost.fusion is creeping in in a few places, generating huge depend.make files… Not sure where it comes from yet, but it’s probably a good idea to set a policy to avoid any boost-related matter creeping into the headers.
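As a quick check, something like this could flag public headers that include boost directly (a sketch; the */include/* pattern assumes the usual O2 source layout):

# list headers under the public include/ directories that pull in boost directly
find . -path '*/include/*' -name '*.h' -exec grep -l '#include <boost' {} \;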

Could you see the same issue also with ninja-build, or is ninja smarter here?

I think ninja does not need that, and that’s where it actually simplifies over make. That’s why the incremental builds are faster when using it (of course I could be wrong). However, even if we switch to ninja, compilation will still be slow if we have some header which accidentally brings in boost.fusion, as it will have to be parsed for each inclusion.

Well, I guess Ninja does need the same kind of information but is probably a bit smarter in storing it.

Ninja has (as far as I can tell) its include dependencies in a single (binary) .ninja_deps file (which can be inspected using the ninja -t deps sub-command), while we get 227 depend.make files…
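For example (run from the ninja build directory; the head is just to keep the output short):

# dump the recorded header dependencies and peek at the first entries
ninja -t deps | head -20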

And the .ninja_deps file is 4.1 MB for the build above, while some single depend.make files are bigger than that:

find . -name depend.make -exec ls -s {} \; | sort -n
...
4192 ./Framework/Utils/CMakeFiles/test_DPLOutputTest.dir/depend.make
4200 ./Utilities/aliceHLTwrapper/CMakeFiles/aliceHLTwrapper.dir/depend.make
4208 ./Detectors/TPC/calibration/CMakeFiles/TPCCalibration.dir/depend.make
4224 ./Detectors/TPC/base/CMakeFiles/TPCBase.dir/depend.make
5992 ./Framework/Utils/CMakeFiles/test_DPLBroadcasterMerger.dir/depend.make
6152 ./CCDB/CMakeFiles/CCDB.dir/depend.make
6152 ./Framework/Utils/CMakeFiles/DPLUtils.dir/depend.make
8200 ./Detectors/TPC/workflow/CMakeFiles/tpc-reco-workflow.dir/depend.make
10304 ./Utilities/DataFlow/CMakeFiles/DataFlow.dir/depend.make
13512 ./Detectors/MUON/MCH/Mapping/Impl3/CMakeFiles/MCHMappingImpl3.dir/depend.make
22552 ./Framework/Core/CMakeFiles/Framework.dir/depend.make

(ls -s gives sizes in 512-byte blocks)

But regardless of the build system, I agree with Giulio that we should anyway be very careful about our dependencies…

So the problem is that a FairMQDevice derives from a FairMQStateMachine, which itself derives from a boost::state_machine, which basically requires the whole of boost to be parsed. Since this happens in the headers, anything including FairMQDevice.h will take forever to compile. Moreover, given that FairMQDevice is what we use to send messages, anything which sends messages will bring the whole of boost in. I think this is a design issue with FairMQ and they should refactor things so that the state machine internal details are not exposed to the world. The reason why the framework is so slow to compile is merely that we made it very easy for people to write devices and send messages, so we are simply exposing a scalability issue with the current FairMQDevice.h header.
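A cheap way to see how heavy that header really is would be to preprocess a file that only includes it and count the lines it expands to (a rough sketch; the include paths are illustrative and may need to be extended for other dependencies):

# preprocess a one-line file including FairMQDevice.h and count the resulting lines
printf '#include <FairMQDevice.h>\n' > /tmp/probe.cxx
c++ -std=c++14 -E /tmp/probe.cxx -I"$FAIRROOT/include" -I"$FAIRROOT/include/fairmq" -I"$BOOST/include" | wc -l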

I will try to follow this up with the FairMQ people.

Apparently a known issue… https://github.com/FairRootGroup/FairRoot/issues/775