Module vs alienv

laphecet · May 24, 2018, 11:38am

I don’t know exactly what alienv does beyond what module does, but it is (way) slower. As a result I got into the habit a while ago of using the module command whenever I can (also because, since I discovered modulefiles, I’ve been using them for other things as well) (*).

But now, triggered by a comment by Dario in https://github.com/alisw/alidist/issues/1122#issuecomment-391488689
I’d be interested to know what the caveats are (in the context of O2) of using module instead of alienv.

Thanks,

(*) The one notable exception is executing a single command, where I do use alienv setenv ... -c cmd (and get hit by the slowness).

sbinet · May 24, 2018, 12:05pm

I’ve personally migrated all my alienv-like and module-like uses to direnv (which, as its name implies, creates a new environment crafted on a per-directory basis: https://direnv.net/)
I discovered it when I was working on the CMT replacement (back in my ATLAS days.)

(apologies for being slightly off-topic)

dberzano · May 24, 2018, 1:25pm

Hi @sbinet, thanks for the tip. I personally dislike environment modules, and I dislike TCL even more. Unfortunately we’ve decided to use that for compatibility with what we previously had on the Grid. I would go so far as to say that, in order to manage environment variables, you don’t even need anything more than… a shell, so even direnv is too much IMHO.

Basically alienv collects all modulefiles in one place at each run. This is slow. For the moment, you can avoid this slowness by appending --no-refresh. I agree that there is a lot of room for improvement there.

Anyway, the point of using alienv is: if we switch to something other than environment modules, we won’t need to tell users the new commands. What alienv does on top of module can be seen in the source code:

1) as said, it collects modulefiles in one place,
2) it sets the MODULEPATH properly.

It’s nothing more than a convenience layer on top of module that outputs stuff in a way that’s compatible with what we already had on the Grid. But the reason for the slowness is point 1), and I do see room for improvement there. I’d appreciate it if you have suggestions/PRs.
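The two steps described above can be sketched roughly as follows. This is an illustrative Python sketch, not the actual alienv code: the directory layout, the function names `collect_modulefiles` and `with_modulepath`, and the choice to copy files are all assumptions made for the example.

```python
import os
import shutil
import tempfile
from pathlib import Path


def collect_modulefiles(work_dir: Path, arch: str) -> Path:
    """Copy every package's modulefile into one directory tree.

    Assumed (hypothetical) layout:
      input:  <work_dir>/<arch>/<package>/<version>/etc/modulefiles/<package>
      output: <work_dir>/MODULES/<arch>/<package>/<version>
    """
    arch_dir = work_dir / arch
    modules_dir = work_dir / "MODULES" / arch
    modules_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(arch_dir.glob("*/*/etc/modulefiles/*")):
        package, version = src.relative_to(arch_dir).parts[:2]
        if src.name != package:
            continue  # only keep the modulefile named after its package
        dest = modules_dir / package / version
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dest)
    return modules_dir


def with_modulepath(env: dict, modules_dir: Path) -> dict:
    """Return a copy of env with modules_dir placed first in MODULEPATH."""
    current = env.get("MODULEPATH", "").split(os.pathsep)
    others = [p for p in current if p and p != str(modules_dir)]
    new_env = dict(env)
    new_env["MODULEPATH"] = os.pathsep.join([str(modules_dir)] + others)
    return new_env


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        src_dir = work / "some-arch" / "O2" / "v1" / "etc" / "modulefiles"
        src_dir.mkdir(parents=True)
        (src_dir / "O2").write_text("#%Module1.0\n")
        modules = collect_modulefiles(work, "some-arch")
        print(with_modulepath({}, modules)["MODULEPATH"])
```

Walking the whole tree on every invocation is what makes this kind of step expensive, which is consistent with the slowness discussed here.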

sbinet · May 24, 2018, 1:58pm

direnv just needs a shell hook (and direnv, a completely statically compiled binary).

when I was playing with the CMT replacement, I toyed with another completely statically compiled binary (having at that time already drunk the Go koolaid) that was loading given environment configuration from a single binary file.
you’d do:

$> hwaf run athena.py some-jobo.py

pros:

no stat(2) storm across a possibly deeply nested filesystem,
no pollution of the user environment,
one can run 2 jobs from different releases from the same shell/terminal,
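The idea behind these points can be illustrated with a small Python sketch. It is not how hwaf works internally: the JSON file stands in for the single binary configuration file, and `run_with_env` and the `DEMO_RELEASE` variable are hypothetical names used only for this example.

```python
import json
import os
import subprocess
import sys
import tempfile


def run_with_env(env_file: str, command: list) -> subprocess.CompletedProcess:
    """Run command in a child process with extra variables read from env_file.

    env_file is assumed to be a JSON object mapping variable names to strings.
    Only the child sees the overrides; the calling shell/process is untouched.
    """
    with open(env_file, encoding="utf-8") as handle:
        overrides = json.load(handle)
    child_env = {**os.environ, **overrides}
    return subprocess.run(
        command, env=child_env, capture_output=True, text=True, check=True
    )


if __name__ == "__main__":
    show = [sys.executable, "-c", "import os; print(os.environ['DEMO_RELEASE'])"]
    with tempfile.TemporaryDirectory() as tmp:
        for release in ("release-A", "release-B"):
            path = os.path.join(tmp, release + ".json")
            with open(path, "w", encoding="utf-8") as handle:
                json.dump({"DEMO_RELEASE": release}, handle)
            # Two "releases" run back to back from the same parent process.
            print(run_with_env(path, show).stdout.strip())
```

Because the environment is read from one file and passed only to the child, nothing has to be loaded or unloaded in the interactive shell.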

anyways…

back to fer…

laphecet · May 24, 2018, 2:17pm

Well, there is a Lua version if you prefer

Anyway, one feature I like besides loading and unloading the env. (env which anyway should be kept minimal to avoid giant *_PATH hierarchies, I agree) is that I can discover what’s there to be used, using module avail and module show.

Back to the slowness issue, @dberzano: I will try to have a look if I can, but first a possibly stupid question (I have not looked at the code yet): why aren’t the modulefiles already put in one place by aliBuild?

dberzano · May 24, 2018, 2:37pm

It’s absolutely not a stupid question. This would actually solve the problem. Back in 2015, we simply started doing things like this and never changed it. There should still be a way to force-refresh the modules dir too. I believe that the best thing we can do is:

make aliBuild call alienv just to refresh the modules,
invoke alienv with --no-refresh by default
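The proposed workflow can be sketched as below. This is only a toy model in Python under stated assumptions: `refresh_modules`, `build` and `enter_environment` are hypothetical stand-ins for the aliBuild and alienv steps, and the `refresh` flag stands in for the force-refresh option mentioned above.

```python
import tempfile
from pathlib import Path


def refresh_modules(modules_dir: Path, log: list) -> None:
    """Stand-in for the slow step that gathers all modulefiles in one place."""
    modules_dir.mkdir(parents=True, exist_ok=True)
    log.append("refresh")


def build(modules_dir: Path, log: list) -> None:
    """Build step: pay the refresh cost once, right after building."""
    log.append("build")
    refresh_modules(modules_dir, log)


def enter_environment(modules_dir: Path, log: list, refresh: bool = False) -> None:
    """Environment step: skip the refresh unless it is explicitly forced."""
    if refresh:
        refresh_modules(modules_dir, log)
    log.append("enter")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        events = []
        modules = Path(tmp) / "MODULES"
        build(modules, events)
        enter_environment(modules, events)                # fast path, no refresh
        enter_environment(modules, events, refresh=True)  # forced refresh
        print(events)
```

The point of the design is that the expensive step runs once per build instead of once per environment load.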

swenzel · May 24, 2018, 2:40pm

Fully agree with Laurent. To repeat: Modules are fantastic … in particular because I have full control and can even switch/unload packages from an environment + I am able to query what is loaded.
(In this sense it is much better than sourcing an environment from recursive shell scripts).

This technology is surely old but still a standard technology in HPC clusters around the world.

Who cares if something is using Tcl?

dberzano · May 24, 2018, 3:26pm

I didn’t mean that we choose one technology over another because I like or dislike it. I was speaking as “librarian” and “infrastructure manager”, and TCL is quite painful in this respect (X bindings et al.). My personal opinion on TCL is not how we take decisions: we have, indeed, taken into consideration that it’s a standard for HPC. So my comment on TCL was just pour parler, and modules are going to stay, especially now that they seem to be actively maintained after years of inactivity.

Practically speaking, the sole problem we have is speed. alienv is slow, we’ve identified the reason and we have a possible solution, so I think this matter is settled.

bvonhall · June 12, 2018, 9:17am

I have another major issue: starting several instances of alienv at once fails consistently (only one will pass). It is a problem when you want to put it in your .bashrc, or if you try to set up several terminals at once (e.g. in terminator), or if you run a benchmark and launch many instances of a program at once.

dberzano · June 12, 2018, 9:39am

Hi @bvonhall,

this works around your problem as well.

bvonhall · June 12, 2018, 9:40am

Excellent, thanks !