Running async QC (on a laptop)

In order to debug the (yet to be added) async QC of MCH, I’d like to actually run the async QC on a small data sample, on my laptop.

So, before trying with MCH, I’m first trying with another detector for which there is already a JSON configuration file in O2DPG/DATA/production/qc-async: MID.

I’m trying to run:

TFDELAY=0 BEAMTYPE="pp" WORKFLOWMODE="run" WORKFLOW_PARAMETERS=QC WORKFLOW_DETECTORS=MID GEN_TOPO_WORKDIR=$HOME/tmp bash ../O2/prodtests/full-system-test/run-workflow-on-inputlist.sh CTF ctf.list

where ctf.list lists some (locally copied) CTFs:

/Volumes/LaData/alice/data/2022/LHC22m/522622/raw/0810/o2_ctf_run00522622_orbit0205183692_tf0000000001_epn154.root
/Volumes/LaData/alice/data/2022/LHC22m/522622/raw/0810/o2_ctf_run00522622_orbit0205183820_tf0000000002_epn109.root
/Volumes/LaData/alice/data/2022/LHC22m/522622/raw/0810/o2_ctf_run00522622_orbit0205183948_tf0000000003_epn038.root
/Volumes/LaData/alice/data/2022/LHC22m/522622/raw/0810/o2_ctf_run00522622_orbit0205184076_tf0000000004_epn133.root
/Volumes/LaData/alice/data/2022/LHC22m/522622/raw/0810/o2_ctf_run00522622_orbit0205184204_tf0000000005_epn093.root
/Volumes/LaData/alice/data/2022/LHC22m/522622/raw/0810/o2_ctf_run00522622_orbit0205184332_tf0000000006_epn020.root
/Volumes/LaData/alice/data/2022/LHC22m/522622/raw/0810/o2_ctf_run00522622_orbit0205184460_tf0000000007_epn184.root
/Volumes/LaData/alice/data/2022/LHC22m/522622/raw/0810/o2_ctf_run00522622_orbit0205184588_tf0000000008_epn126.root
/Volumes/LaData/alice/data/2022/LHC22m/522622/raw/0810/o2_ctf_run00522622_orbit0205184716_tf0000000009_epn085.root
/Volumes/LaData/alice/data/2022/LHC22m/522622/raw/0810/o2_ctf_run00522622_orbit0205184844_tf0000000010_epn237.root
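A list like the one above can be built by hand, but a small helper makes it easier to change the sample size. The sketch below is only an illustration: the function name `make_ctf_list` and its arguments are hypothetical, and it assumes the CTF files follow the `o2_ctf_run*.root` naming seen in the list.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of O2 or O2DPG): write the full paths of
# the first N CTF files found under a directory into a list file,
# one path per line, sorted by name.
make_ctf_list() {
  local ctf_dir="$1"          # directory holding the locally copied CTFs
  local list_file="$2"        # output list, e.g. ctf.list
  local max_files="${3:-10}"  # number of files to keep (default: 10)
  find "$ctf_dir" -type f -name 'o2_ctf_run*.root' \
    | sort | head -n "$max_files" > "$list_file"
}

# Example call, using the directory from the list above:
# make_ctf_list /Volumes/LaData/alice/data/2022/LHC22m/522622/raw/0810 ctf.list 10
```

Sorting by file name keeps the timeframes in `tf` order here only because the orbit and timeframe counters are zero-padded in these file names.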

Doing so, I get lots of error messages from the proxies:

[68444:mid-tracks-proxy]: [17:06:33][ERROR] could not resolve hostname 'any', reason: resolve: Host not found (authoritative)
[68444:mid-tracks-proxy]: [17:06:33][ERROR] failed to attach channel mid-tracks[0] (connect)

Any idea why I’m getting those? Are there some environment variables that I forgot to set?
Or should the async QC be run some other way?

Running with WORKFLOWMODE=print, I get:

Started /Users/laurent/alice/spack/laurent/qc/O2/prodtests/full-system-test/dpl-workflow.sh with PID 71096
#Workflow command:

o2-ctf-reader-workflow --session default_71087_23644 --severity info --shm-segment-id 0 --shm-segment-size 8589934592  --early-forward-policy noraw --monitoring-backend no-op:// --fairmq-rate-logging 0 --timeframes-rate-limit 288 --timeframes-rate-limit-ipcid 0 --delay 0 --loop 0  --ctf-input ctf.list   --onlyDet MID  --pipeline tpc-entropy-decoder:1  --configKeyValues "NameConf.mDirGeom=/Users/laurent/alice/spack/laurent/qc/qc-async;NameConf.mDirGRP=/Users/laurent/alice/spack/laurent/qc/qc-async;keyval.input_dir=/Users/laurent/alice/spack/laurent/qc/qc-async;keyval.output_dir=/dev/null;;" | \
o2-mid-reco-workflow --session default_71087_23644 --severity info --shm-segment-id 0 --shm-segment-size 8589934592  --early-forward-policy noraw --monitoring-backend no-op:// --fairmq-rate-logging 0 --timeframes-rate-limit 288 --timeframes-rate-limit-ipcid 0 --disable-root-output --disable-mc --pipeline MIDClusterizer:1,MIDTracker:1  --configKeyValues "NameConf.mDirGeom=/Users/laurent/alice/spack/laurent/qc/qc-async;NameConf.mDirGRP=/Users/laurent/alice/spack/laurent/qc/qc-async;keyval.input_dir=/Users/laurent/alice/spack/laurent/qc/qc-async;keyval.output_dir=/dev/null;;" | \
o2-primary-vertexing-workflow --session default_71087_23644 --severity info --shm-segment-id 0 --shm-segment-size 8589934592  --early-forward-policy noraw --monitoring-backend no-op:// --fairmq-rate-logging 0 --timeframes-rate-limit 288 --timeframes-rate-limit-ipcid 0  --disable-mc --disable-root-input --disable-root-output --vertexing-sources MID --vertex-track-matching-sources MID --skip --pipeline primary-vertexing:1,pvertex-track-matching:1  --configKeyValues "NameConf.mDirGeom=/Users/laurent/alice/spack/laurent/qc/qc-async;NameConf.mDirGRP=/Users/laurent/alice/spack/laurent/qc/qc-async;keyval.input_dir=/Users/laurent/alice/spack/laurent/qc/qc-async;keyval.output_dir=/dev/null;;pvertexer.maxChi2TZDebris=10;;;" | \
o2-qc --session default_71087_23644 --severity info --shm-segment-id 0 --shm-segment-size 8589934592  --early-forward-policy noraw --monitoring-backend no-op:// --fairmq-rate-logging 0 --timeframes-rate-limit 288 --timeframes-rate-limit-ipcid 0 --config json:///Users/laurent/tmp/json_cache/20221014-171546-71096-11348--MID.json --local --host localhost   --configKeyValues "NameConf.mDirGeom=/Users/laurent/alice/spack/laurent/qc/qc-async;NameConf.mDirGRP=/Users/laurent/alice/spack/laurent/qc/qc-async;keyval.input_dir=/Users/laurent/alice/spack/laurent/qc/qc-async;keyval.output_dir=/dev/null;;" | \
o2-dpl-run --session default_71087_23644 --severity info --shm-segment-id 0 --shm-segment-size 8589934592  --early-forward-policy noraw --monitoring-backend no-op:// --fairmq-rate-logging 0 --timeframes-rate-limit 288 --timeframes-rate-limit-ipcid 0

Cleaning up for shared memory id 'ccd65265'...
Did not find 'fmq_ccd65265_mng' management segment. No regions to cleanup.
Done processing ctf.list in CTF mode

and the generated QC configuration is:

$ cat /Users/laurent/tmp/json_cache/20221014-171546-71096-11348--MID.json 
{
  "qc": {
    "config": {
      "database": {
        "implementation": "CCDB",
        "host": "ali-qcdb.cern.ch:8083",
        "username": "not_applicable",
        "password": "not_applicable",
        "name": "not_applicable",
        "maxObjectSize": "20000000"
      },
      "Activity": {
        "number": "REPLACE_ME_RUNNUMBER",
        "type": "2",
        "passName": "REPLACE_ME_PASS",
        "periodName": "REPLACE_ME_PERIOD",
        "provenance": "qc_async"
      },
      "monitoring": {
        "url": "infologger:///debug?qc"
      },
      "consul": {
        "url": ""
      },
      "conditionDB": {
        "url": "alice-ccdb.cern.ch"
      },
      "infologger": {
        "filterDiscardDebug": "true",
        "filterDiscardLevel": "1"
      }
    },
    "tasks": {
      "QcTaskMIDDigits": {
        "active": "true",
        "className": "o2::quality_control_modules::mid::DigitsQcTask",
        "moduleName": "QcMID",
        "detectorName": "MID",
        "cycleDurationSeconds": "60",
        "maxNumberCycles": "-1",
        "dataSource": {
          "type": "dataSamplingPolicy",
          "name": "mid-digits"
        }
      },
      "QcTaskMIDClust": {
        "active": "true",
        "className": "o2::quality_control_modules::mid::ClustQcTask",
        "moduleName": "QcMID",
        "detectorName": "MID",
        "cycleDurationSeconds": "60",
        "maxNumberCycles": "-1",
        "dataSource": {
          "type": "dataSamplingPolicy",
          "name": "mid-clusters"
        }
      },
      "QcTaskMIDTracks": {
        "active": "true",
        "className": "o2::quality_control_modules::mid::TracksQcTask",
        "moduleName": "QcMID",
        "detectorName": "MID",
        "cycleDurationSeconds": "60",
        "maxNumberCycles": "-1",
        "dataSource": {
          "type": "dataSamplingPolicy",
          "name": "mid-tracks"
        }
      }
    },
    "checks": {
      "QcCheckMIDDigits": {
        "active": "true",
        "className": "o2::quality_control_modules::mid::DigitsQcCheck",
        "moduleName": "QcMID",
        "detectorName": "MID",
        "policy": "OnAny",
        "checkParameters": {
          "MeanMultThreshold": "100."
        },
        "dataSource": [
          {
            "type": "Task",
            "name": "QcTaskMIDDigits",
            "MOs": [
              "mMultHitMT11B",
              "mMultHitMT12B",
              "mMultHitMT21B",
              "mMultHitMT22B"
            ]
          }
        ]
      },
      "QcCheckMIDClust": {
        "active": "true",
        "className": "o2::quality_control_modules::mid::ClustQcCheck",
        "moduleName": "QcMID",
        "detectorName": "MID",
        "policy": "OnAny",
        "dataSource": [
          {
            "type": "Task",
            "name": "QcTaskMIDClust",
            "MOs": []
          }
        ]
      },
      "QcCheckMIDTracks": {
        "active": "true",
        "className": "o2::quality_control_modules::mid::TracksQcCheck",
        "moduleName": "QcMID",
        "detectorName": "MID",
        "policy": "OnAny",
        "dataSource": [
          {
            "type": "Task",
            "name": "QcTaskMIDTracks",
            "MOs": []
          }
        ]
      }
    },
    "externalTasks": null,
    "postprocessing": null
  },
  "dataSamplingPolicies": [
    {
      "id": "mid-tracks",
      "active": "true",
      "machines": [],
      "query": "tracks:MID/TRACKS;trackrofs:MID/TRACKROFS",
      "samplingConditions": [
        {
          "condition": "random",
          "fraction": "0.1",
          "seed": "1441"
        }
      ],
      "blocking": "false"
    },
    {
      "id": "mid-clusters",
      "active": "true",
      "machines": [],
      "query": "clusters:MID/TRACKCLUSTERS;clusterrofs:MID/TRCLUSROFS",
      "samplingConditions": [
        {
          "condition": "random",
          "fraction": "0.1",
          "seed": "1441"
        }
      ],
      "blocking": "false"
    },
    {
      "id": "mid-digits",
      "active": "true",
      "machines": [],
      "query": "digits:MID/DATA;digits_rof:MID/DATAROF",
      "samplingConditions": [
        {
          "condition": "random",
          "fraction": "0.1",
          "seed": "1441"
        }
      ],
      "blocking": "false"
    }
  ]
}
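The generated file above still contains a few `REPLACE_ME_...` values in its Activity block. Whether they matter for a local test is not established here, but they are easy to list when inspecting a generated configuration. A minimal sketch, where the function name `list_placeholders` is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the distinct REPLACE_ME_* placeholders
# still present in a generated QC JSON file, one per line.
list_placeholders() {
  local json_file="$1"
  # grep exits non-zero when nothing matches; "|| true" keeps that
  # from being treated as an error by the calling script.
  grep -o 'REPLACE_ME_[A-Z_]*' "$json_file" | sort -u || true
}

# Example call, using the file shown above:
# list_placeholders /Users/laurent/tmp/json_cache/20221014-171546-71096-11348--MID.json
```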

Thanks,

For the record I was able to make some progress.

The issue is that, in my original attempt, the --local option was given to o2-qc, which I guess made the QC try to generate various task proxies (which failed, as there’s no remote).

So instead I now define the QC_CONFIG_PARAM variable to force the use of the --local-batch option:

TFDELAY=0 BEAMTYPE="pp" WORKFLOWMODE="run" QC_CONFIG_PARAM="--local-batch=QC.root" WORKFLOW_PARAMETERS=QC WORKFLOW_DETECTORS=MCH GEN_TOPO_WORKDIR=$HOME/tmp bash ../O2/prodtests/full-system-test/run-workflow-on-inputlist.sh CTF ctf.list

and all seems to work now (so far at least :wink: )
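To see which of the two options actually reaches o2-qc before launching anything, the workflow printed with WORKFLOWMODE=print can be inspected. The sketch below is only an illustration: the function name `qc_local_mode` is hypothetical, and it assumes the o2-qc command starts at the beginning of a line, as in the printed workflow shown earlier.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a printed workflow on stdin and report
# which local mode the o2-qc command was given.
qc_local_mode() {
  local qc_line
  # Keep only the first line that starts with the o2-qc command.
  qc_line=$(grep -E '^o2-qc ' | head -n 1 || true)
  case "$qc_line" in
    *"--local-batch"*)          echo "local-batch" ;;
    *"--local "*|*"--local")    echo "local" ;;
    *)                          echo "none" ;;
  esac
}
```

Assuming the printed workflow goes to standard output, it could be used as:

```shell
TFDELAY=0 BEAMTYPE="pp" WORKFLOWMODE="print" QC_CONFIG_PARAM="--local-batch=QC.root" WORKFLOW_PARAMETERS=QC WORKFLOW_DETECTORS=MCH GEN_TOPO_WORKDIR=$HOME/tmp bash ../O2/prodtests/full-system-test/run-workflow-on-inputlist.sh CTF ctf.list | qc_local_mode
```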

Hi, sorry, I was away for the last two weeks, but just to confirm: this is indeed what you needed. Here is also an example of this: