Running reconstruction on multiple EPNs and QC tasks on merged output

sheckel · May 7, 2021, 5:15pm

Dear @bvonhall, dear all,

we are preparing some tests for the TPC QC in the upcoming Milestone Week 5 (MW5). We have already successfully tested locally (on data replay) a full reconstruction chain with one QC task each on the raw data and during the reconstruction, and several QC tasks at the last stage after reconstruction.

Now during MW5, the synchronous reconstruction of the TPC data will run in parallel on multiple EPNs and in the end we would like to have the combined (merged) output of all the QC tasks. Here we have two cases and related questions:

1. The tasks running before or during reconstruction will have to run on all EPNs. Is it already possible to merge the output of those tasks, e.g. on a dedicated QC node?
2. For the other tasks, is it already possible not to run them on the EPNs, but instead to collect and merge the reconstruction output of all EPNs and then run these tasks only once on a QC node?

Thanks a lot for any help with this!

Cheers,
Thomas and Stefan

bvonhall · May 10, 2021, 7:38am

Dear Thomas & Stefan,

Here is my take:

1. If you run with the scripts, yes, we can set up merger(s) as requested. We would actually do exactly what you do, I think, on your local setup: QualityControl/Advanced.md at master · AliceO2Group/QualityControl · GitHub
@pkonopka, correct me if I am wrong please.
2. Would you run on the stored output, i.e. files? In that case I think it is trivial, as you could even do it by hand. And of course, not running something just means removing it from the config. If you mean a proper post-processing task, we have not done it yet at P2.

Cheers,
Barth

tklemenz · May 10, 2021, 8:29am

Hello @bvonhall,

thanks for the link to the docu. I wasn’t aware of that part.

The options described above are basically:

1. Run QC tasks on multiple nodes locally, merge their outputs on a remote machine and then publish to QCG, which is covered by the docu you provided, as far as I understand (but there will probably be a few questions later).
2. Collect data samples (e.g. TPC tracks) from multiple nodes (running e.g. online reconstruction) on a remote machine and run the QC task on that machine, then publish the QC output to QCG.

We need to think about whether option 2 is needed at all: the QC task output has to be mergeable anyway for option 1, so option 2 could probably be achieved via option 1 as well.
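
To illustrate why option 2 can often be reduced to option 1, here is a minimal sketch in plain Python. It does not use the QC framework API; `fill_histogram`, `merge_histograms` and the sample values are hypothetical. It assumes a purely additive object, a histogram of counts with identical binning on every node: filling per node and summing the bins then gives the same result as collecting all samples on one machine and filling once.

```python
from typing import List


def fill_histogram(samples: List[float], n_bins: int, lo: float, hi: float) -> List[int]:
    """Fill a fixed-binning histogram of counts; out-of-range samples are dropped."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x in samples:
        if lo <= x < hi:
            # Clamp the index to guard against floating-point rounding at the upper edge.
            counts[min(int((x - lo) / width), n_bins - 1)] += 1
    return counts


def merge_histograms(parts: List[List[int]]) -> List[int]:
    """Merge histograms with identical binning by summing bin contents."""
    return [sum(bins) for bins in zip(*parts)]


# Hypothetical per-node samples (e.g. some track quantity seen on each EPN).
node_samples = [[0.1, 0.4, 0.45], [0.4, 0.9], [0.05, 0.95, 0.5]]

# Option 1: fill on each node, merge on the remote machine.
merged = merge_histograms([fill_histogram(s, 4, 0.0, 1.0) for s in node_samples])

# Option 2: collect all samples on the remote machine, fill once.
collected = fill_histogram([x for s in node_samples for x in s], 4, 0.0, 1.0)

assert merged == collected
```

The equivalence only holds for objects whose merge is a plain sum; anything derived (ratios, means, fits) needs extra care.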

Cheers,
Thomas

tklemenz · May 12, 2021, 5:17pm

Hi @bvonhall,

for option 1 described above:
Would it in general be possible to split a QC task into two parts, so that part A runs on the local nodes, which send MOs to the remote machine, and the remote machine then performs part B of the task on the MOs before publishing?

This would for example be convenient for objects where merging needs further information from other MOs to be done correctly.
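
One example of such an object is a per-bin ratio (e.g. an efficiency): ratios from different nodes cannot simply be added or averaged, because the numerator and denominator counts have to be merged first. Below is a minimal sketch in plain Python of what a part A / part B split could look like. All names (`part_a`, `merge_counts`, `part_b`) and numbers are hypothetical and not part of the QC framework.

```python
from typing import List, Tuple

Counts = Tuple[List[int], List[int]]  # (passed, total) counts per bin


def part_a(passed: List[int], total: List[int]) -> Counts:
    """Runs on each local node: publish only the additive counts."""
    return (list(passed), list(total))


def merge_counts(parts: List[Counts]) -> Counts:
    """Runs on the remote machine: sum numerators and denominators bin by bin."""
    passed = [sum(b) for b in zip(*(p for p, _ in parts))]
    total = [sum(b) for b in zip(*(t for _, t in parts))]
    return passed, total


def part_b(merged: Counts) -> List[float]:
    """Runs on the remote machine after merging: derive the ratio."""
    passed, total = merged
    return [p / t if t else 0.0 for p, t in zip(passed, total)]


# Two hypothetical nodes with very different statistics.
nodes = [part_a([1, 0], [2, 0]), part_a([9, 3], [10, 4])]
ratio = part_b(merge_counts(nodes))

# For comparison: averaging the per-node ratios of bin 0 gives a different answer.
naive_bin0 = (1 / 2 + 9 / 10) / 2
assert abs(ratio[0] - naive_bin0) > 0.1
```

Here the correctly merged ratio in bin 0 is 10/12, while the naive average of the per-node ratios is 0.7.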

Cheers,
Thomas

bvonhall · May 17, 2021, 6:57am

Hi Thomas,

At the moment, it is not possible. We have to discuss it with Piotr and see the implications. You could possibly achieve the same with post-processing, i.e. a task that would get all the sub-objects and merge them “smartly” (and we would trash the sub-objects and keep only the merged one).
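
A rough sketch of that idea in plain Python, purely for illustration: the store layout, the key names and `smart_merge` are assumptions and do not reflect the actual post-processing interface. Each sub-object is taken to be a (mean, entries) pair from one node; the merge is entry-weighted, and the sub-objects are dropped afterwards.

```python
from typing import Dict, Tuple

# Hypothetical sub-object: (mean, number of entries) published by one node.
SubObject = Tuple[float, int]


def smart_merge(store: Dict[str, SubObject], prefix: str) -> Dict[str, SubObject]:
    """Replace all sub-objects stored under `prefix/` by one entry-weighted merge."""
    subs = {k: v for k, v in store.items() if k.startswith(prefix + "/")}
    entries = sum(n for _, n in subs.values())
    mean = sum(m * n for m, n in subs.values()) / entries if entries else 0.0
    # Keep everything that is not a sub-object, then add the merged object.
    result = {k: v for k, v in store.items() if k not in subs}
    result[prefix] = (mean, entries)
    return result


# Hypothetical store content: two per-node sub-objects and one unrelated object.
store = {"clusters/node1": (2.0, 10), "clusters/node2": (4.0, 30), "other": (1.0, 5)}
cleaned = smart_merge(store, "clusters")
```

After the call, `cleaned` holds only the merged `clusters` object and the unrelated `other` object; the original `store` is left untouched.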

Cheers,
Barth

tklemenz · May 17, 2021, 8:49am

Ok, thanks for the info!
I will think about the post-processing way.

Cheers,
Thomas