O2 simulation does not work when QC environment is loaded

Hi experts,

I noticed that o2-sim, o2-sim-digitizer-workflow, and o2-tpc-reco-workflow do not work anymore when the QC environment is loaded. It seems that there is a problem reading/creating files. When only the O2 environment is loaded everything works fine.

This causes problems for QC workflows that consume e.g. the output of o2-tpc-reco-workflow directly, where you would normally just pipe the reco workflow into the QC workflow. It is not a major issue at the moment (at least not for us), but I guess this is not intended, so it should probably be fixed.

I am on Ubuntu 18.04 and alidist, O2, and QualityControl were last updated on Jan 19.

Here are the logs and the loaded modules for O2 and QC:
o2-sim
o2-sim-digitizer-workflow
o2-tpc-reco-workflow
O2 modules
QC modules

Thanks a lot!
Cheers,
Thomas

Looks like the magnetic field file is no longer found (at least judging from the o2-sim-digitizer-workflow log). Maybe some environment variable gets redefined? In which order do you load O2 + QC? Does the order matter?

I load either one or the other depending on my needs because all modules that are loaded by O2 are also loaded by QC (+ additional modules), right? At least this is my perception (and also seen in alienv list).
However, in the output of alienv list (see the link to the modules above) the order of the modules is indeed different, in case this makes a difference.

What I do to load the environments:
O2: alienv -w /path/to/AliSoftware/sw/ load O2/latest
QC: alienv -w /path/to/AliSoftware/sw/ load QualityControl/latest

For completeness: If I first load O2 and then QC on top, I still get the same outcome.

Indeed, the digitization log contains

[15538:TPCDigitizer_0]: [17:31:59][FATAL] MagneticField::loadParameterization: Failed to open magnetic field data file /home/tklemenz/AliSoftware/sw/ubuntu1804_x86-64/FairRoot/v18.4.1-5/share/fairbase/examples/Common/maps/mfchebKGI_sym.root

The mfchebKGI_sym.root is looked for in $(O2_ROOT)/share/Common/maps/mfchebKGI_sym.root, which means that your O2_ROOT actually points to the FairRoot installation directory.

The question is why? Is QC redefining this environment variable or is it some bug in alienv / loading modules?

Interesting…

If I print O2_ROOT in the terminal I get
/home/tklemenz/AliSoftware/sw/ubuntu1804_x86-64/O2/FileReaderWorkflow-1

Btw, also my colleagues have this problem.

Edit: mfchebKGI_sym.root is present in the directory @shahoian pointed to.

I am taking a look now. Also David saw this problem last week.

I confirm the issue. But in my case it is VMCWORKDIR which is not correct when using QC. My suspicion is that this must be a bug in the generated modulefiles (wrong order??).
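To see which directory the lookup actually resolves to, something like the following can be run once with only O2 loaded and once with QC loaded. It is a minimal sketch, assuming the map is looked up as $VMCWORKDIR/Common/maps/mfchebKGI_sym.root (which matches the path in the FATAL message above); the helper name check_map_dir is made up for illustration.

```shell
# Relative location of the field map, taken from the FATAL message above.
MAP_REL="Common/maps/mfchebKGI_sym.root"

# check_map_dir DIR (hypothetical helper): report whether DIR contains the field map.
# Prints "ok <path>" or "missing <path>" and returns non-zero when the file is missing.
check_map_dir() {
  map_path="$1/${MAP_REL}"
  if [ -r "$map_path" ]; then
    echo "ok ${map_path}"
  else
    echo "missing ${map_path}"
    return 1
  fi
}

# Print the two variables discussed in this thread, then test the assumed lookup.
echo "O2_ROOT=${O2_ROOT:-<unset>}"
echo "VMCWORKDIR=${VMCWORKDIR:-<unset>}"
check_map_dir "${VMCWORKDIR:-}" || echo "VMCWORKDIR does not lead to the field map"
```

Comparing the two outputs should show directly whether VMCWORKDIR changes between the two environments.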

@swenzel right, after rechecking I confirm that we use a static method

MagneticField::createFieldMap(...const std::string path = std::string(gSystem->Getenv("VMCWORKDIR")) +std::string("/Common/maps/mfchebKGI_sym.root"));

rather than the default c-tor with $(O2_ROOT)/share/Common/maps/mfchebKGI_sym.root

@swenzel you are right, sw/MODULES/ubuntu1804_x86-64/QualityControl/latest first loads O2, then FairRoot:

if ![ is-loaded 'O2/dev-1' ] { module load O2/dev-1 }
...
if ![ is-loaded 'FairRoot/v18.4.1-1' ] { module load FairRoot/v18.4.1-1 }

though in the alidist/qualitycontrol.sh the dependencies are mentioned in the correct order. What defines the order in the MODULES?
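For anyone who wants to compare the load order on their own installation, the `module load` lines can be pulled out of a generated modulefile in file order. This is only a sketch: the helper name list_module_loads is invented here, and it assumes the modulefile uses the `{ module load <package> }` form shown above.

```shell
# list_module_loads FILE (hypothetical helper): print the package named in every
# `module load` statement of FILE, one per line, in the order they appear.
list_module_loads() {
  grep -o "module load [^ }]*" "$1" | awk '{print $3}'
}

# Example call (adjust the prefix to your own installation):
#   list_module_loads /path/to/AliSoftware/sw/MODULES/ubuntu1804_x86-64/QualityControl/latest
```

Running it on both the O2 and the QualityControl modulefile makes it easy to see where FairRoot ends up relative to O2.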

In principle the code looks correct. O2 is loaded first which internally should load FairRoot. The second FairRoot loading should not happen. I am trying to get some debug/verbose info from the module system now.

How do you load the two of them? alienv enter O2 QC? That will not work if the two were built with different sets of externals. You simply need to do alienv enter QC.

I would simply remove VMCWORKDIR from the FairRoot modulefile. Why do we need it at all given it points to what seems to be some dummy example data? Alternatively one should check in the same modulefile if VMCWORKDIR is already set and not set it.
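The "only set it if it is not already set" idea can be illustrated in plain shell. Note that this is not modulefile syntax (modulefiles are Tcl), just the same guard expressed in shell; the function name and both paths are hypothetical placeholders.

```shell
# set_vmcworkdir_default DIR (hypothetical helper): keep an existing VMCWORKDIR,
# and fall back to DIR only when the variable is unset or empty.
set_vmcworkdir_default() {
  : "${VMCWORKDIR:=$1}"
  export VMCWORKDIR
}

# Placeholder paths, not real installation directories.
unset VMCWORKDIR
set_vmcworkdir_default "/hypothetical/O2/share"                          # taken: variable was unset
set_vmcworkdir_default "/hypothetical/FairRoot/share/fairbase/examples"  # ignored: already set
echo "VMCWORKDIR=${VMCWORKDIR}"   # prints VMCWORKDIR=/hypothetical/O2/share
```

With such a guard the first package to define the variable wins, so the result still depends on load order, only in the opposite direction.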

The problem is present even when just saying alienv enter QC/latest. Removing VMCWORKDIR might be a hot-fix, but I think it would be better to understand the underlying problem.

Which version of QC is this?

I have 1.9.0 (basically software stack from 20.01.2021).

Can you cut & paste the QC modulefile?

That said, I disagree it’s just a “hotfix”. Environment variables should be set in a single place: we have no control over the order in which people load their environments (by unfortunate modulefiles design), so having the same variable defined in multiple packages will have unpredictable results.
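The order dependence is easy to reproduce in isolation. In the sketch below, load_o2 and load_fairroot are made-up stand-ins for two modulefiles that both set VMCWORKDIR unconditionally, and the paths are placeholders.

```shell
# Hypothetical stand-ins for two packages that both define VMCWORKDIR.
load_o2()       { export VMCWORKDIR="/hypothetical/O2/share"; }
load_fairroot() { export VMCWORKDIR="/hypothetical/FairRoot/share/fairbase/examples"; }

# final_value_after LOADER... : run the loaders in the given order inside a
# subshell (so the caller's environment is untouched) and print VMCWORKDIR.
final_value_after() {
  (
    for loader in "$@"; do "$loader"; done
    echo "$VMCWORKDIR"
  )
}

final_value_after load_fairroot load_o2   # prints /hypothetical/O2/share
final_value_after load_o2 load_fairroot   # prints /hypothetical/FairRoot/share/fairbase/examples
```

Whichever package is loaded last determines the value, which is the behaviour described in this thread.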

I think the feature of being able to override variables should be allowed and it must not happen that a dependency of both O2 and QC is loaded after O2, especially since the modulefile of QC is autogenerated.

In this case, I have of course nothing against removing VMCWORKDIR from FairRoot as it will quickly unblock some work.

Dear all,

for me this issue is solved after upgrading aliBuild to version 1.8.0. I did not change the rest of the software (alidist, O2, QC), which is still from March 25 because I have another issue there (but that will be for another topic). Just by upgrading aliBuild I am now able to run the simulations again, even when QC (and QCG) is loaded.

Cheers,
Stefan