Good evening,
I have just installed O2 on a new machine (a recently formatted 2017 MacBook Pro) running macOS Big Sur 11.6.2 with Intel Iris Plus Graphics 640 1536 MB.
Asking for:
[INFO] This is o2-sim version 1.2.0 (ee59a8f8b)
[INFO] Built by ALIBUILD:v1.11.5, ALIDIST-REV:7165fc352a9f8a5b559ef4b51554432c32296f61 on OS:Darwin-20.6.0
[INFO] BINDING TO ADDRESS ipc:///tmp/o2sim-notifications-76690 type pub
[INFO] Running with 2 sim workers
[INFO] CREATING SIM SHARED MEM SEGMENT FOR 2 WORKERS
shmget: shmget failed: Invalid argument
[INFO] SHARED MEM INITIALIZED AT ID -1
[WARN] COULD NOT CREATE SHARED MEMORY ... FALLING BACK TO SIMPLE MODE
Spawning particle server on PID 76691; Redirect output to o2sim_serverlog
Spawning sim worker 0 on PID 76692; Redirect output to o2sim_workerlog0
Spawning hit merger on PID 76693; Redirect output to o2sim_mergerlog
^C[INFO] o2-sim driver: Signal caught ... clean up and exit
[INFO] o2-sim driver: Signal caught ... clean up and exit
This breaks the process.
Running:
[O2Physics/latest-master-o2] ~/aliceO2/pythia-attempt %> fairmq-shmmonitor -c
Cleaning up for shared memory id '9ebc917a'...
Did not find 'fmq_9ebc917a_mng' management segment. No regions to cleanup.
[O2Physics/latest-master-o2] ~/aliceO2/pythia-attempt %>
It seems like I don’t have shmget installed somehow…
For now, running o2-sim-serial works, but I understand it is not a long-term solution…
Indeed, I suspect the issue is that 8 GB might not be enough for what you are trying to simulate when running in parallel. @swenzel (who is currently on holiday) might have more ideas on how to reduce your memory footprint. What command are you using?
I have just started using o2-sim-serial to get familiar with the system, to generate Pythia events for now. I will eventually have to contribute directly on EMC trigger simulations, that is, if my computer allows for it…
Specifically, the command was: o2-sim -n 1 -g pythia8pp -m EMC
I doubt that we need 8 GB of memory for a stand-alone EMCAL pp simulation; the simulations I was running fit well within this limit. Could it instead be that a default shared memory buffer is allocated for which there is insufficient memory left on the machine? I thought there was an option to pass the size of the shared memory buffer to the workflows.
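One way to look into the shared memory question is to read the system-wide System V limit, since shmget reports "Invalid argument" when the requested segment size exceeds it. The sketch below only reads kernel settings; `show_shm_limits` is a made-up helper name, and whether this limit is what o2-sim runs into here is an assumption, not something established in this thread.

```shell
# Print the maximum size (in bytes) of a single System V shared memory segment.
# show_shm_limits is a hypothetical helper name, not part of O2.
show_shm_limits() {
  case "$(uname -s)" in
    Darwin)
      # macOS exposes the limit through sysctl
      sysctl -n kern.sysv.shmmax 2>/dev/null || echo "unknown"
      ;;
    Linux)
      # Linux exposes the same limit under /proc
      cat /proc/sys/kernel/shmmax 2>/dev/null || echo "unknown"
      ;;
    *)
      echo "unknown"
      ;;
  esac
}

echo "shmmax: $(show_shm_limits)"
```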
Yes, what matters is not how much memory is actually used but how much is requested, which is proportional to the number of workers, if I understand the code correctly. Try reducing that with -j2 or something like that (the default is 5).
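As a sketch of that suggestion, the command from earlier in the thread can be wrapped so the worker count is a parameter. `run_sim` and `SIM_BIN` are hypothetical names used only here; the flags are the ones already quoted in this thread.

```shell
# run_sim and SIM_BIN are hypothetical names used only for this sketch.
# SIM_BIN selects the program to run; it defaults to o2-sim.
run_sim() {
  "${SIM_BIN:-o2-sim}" "-j$1" -n 1 -g pythia8pp -m EMC
}

# Dry run: print the arguments instead of starting a simulation.
SIM_BIN=echo
run_sim 2
# For a real run inside the O2 environment: unset SIM_BIN; run_sim 2
```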
Thank you, you are right, I didn’t. I just tried it and it seems to still fail because it is an invalid argument:
[O2Physics/latest-master-o2] ~/aliceO2/pythia-attempt %> cd local
[O2Physics/latest-master-o2] ~/aliceO2/pythia-attempt/local %> o2-sim -j1 -m EMC -n 1000 -g pythia8pp
[INFO] This is o2-sim version 1.2.0 (ee59a8f8b)
[INFO] Built by ALIBUILD:v1.11.5, ALIDIST-REV:7165fc352a9f8a5b559ef4b51554432c32296f61 on OS:Darwin-20.6.0
[INFO] BINDING TO ADDRESS ipc:///tmp/o2sim-notifications-5498 type pub
[INFO] Running with 1 sim workers
[INFO] CREATING SIM SHARED MEM SEGMENT FOR 1 WORKERS
shmget: shmget failed: Invalid argument
[INFO] SHARED MEM INITIALIZED AT ID -1
[WARN] COULD NOT CREATE SHARED MEMORY ... FALLING BACK TO SIMPLE MODE
Spawning particle server on PID 5503; Redirect output to o2sim_serverlog
Spawning sim worker 0 on PID 5504; Redirect output to o2sim_workerlog0
Spawning hit merger on PID 5505; Redirect output to o2sim_mergerlog
Please note that the message from shmget is bogus on macOS. Shared memory creation has never worked there, which is why you see “FALLING BACK TO SIMPLE MODE”: the simulation then runs without shared memory.
From your logs, I don’t see any source of error. Is the problem that the simulation hangs, or that it exits without correct results? Please share the files o2sim_serverlog, o2sim_workerlog0, and o2sim_mergerlog for further inspection.
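For a quick look before sending them, the last lines of those three logs can be printed from the directory where o2-sim was started. This is a minimal sketch; `show_sim_logs` is a made-up helper name, and 20 lines is an arbitrary choice.

```shell
# show_sim_logs is a hypothetical helper: print the tail of each o2-sim log
# found in the given directory (default: current directory).
show_sim_logs() {
  dir="${1:-.}"
  for log in o2sim_serverlog o2sim_workerlog0 o2sim_mergerlog; do
    if [ -f "$dir/$log" ]; then
      echo "== $log =="
      tail -n 20 "$dir/$log"
    else
      echo "== $log: not found =="
    fi
  done
}

show_sim_logs .
```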
Good morning, it seems to fail to deploy the workers. Surprisingly, it no longer crashes but just hangs forever trying to deploy them. I am not sure which is worse, the outright crash or the hang… I am sending the logs by mail.
Thanks,
Simone
Most likely it hangs during access to CCDB objects. Please verify that you have a valid ALIEN (GRID certificate) token before launching the simulation.
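A pre-flight check along these lines could look as follows. This is a sketch under assumptions: it presumes the `alien-token-info` tool from the AliEn client is available in the loaded environment and exits with a non-zero status when no valid token exists; `check_token` is a made-up helper name.

```shell
# check_token is a hypothetical helper. It runs a token-inspection tool
# (assumed default: alien-token-info) and reports the outcome.
# Return codes: 0 = token ok, 1 = tool reported a problem, 2 = tool not found.
check_token() {
  tool="${1:-alien-token-info}"
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "missing: $tool"
    return 2
  fi
  if "$tool" >/dev/null 2>&1; then
    echo "token ok"
  else
    echo "no valid token"
    return 1
  fi
}

# Check before launching o2-sim; print a hint if the check does not pass.
check_token || echo "get a valid token before starting the simulation"
```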