Monday, July 11, 2011

Everyone's doing a brand new filesystem now: Come on, baby, do the cvmfs now.

Ever since I heard about it at CHEP 2010, I've been itching to get CVMFS set up at Glasgow, because it was so clearly a better solution for software provision than the old sgm-role / NFS-mounted area approach.
Concerns about the reliability of the hardware the service was running on (it may still not be on production hardware at CERN as I write this) always held the more sensible minds here back. But now that it's all up and working at RAL, and RAL is providing a stratum-1 cache as a backup, there's nothing stopping us.

So, following a combination of Ian Collier's description of the set-up at RAL and the official CernVM-FS technical report (pdf), with some adjustments to fit our Cfengine configuration, I spent some of last week getting cvmfs working on the cluster.

For your edification, this is what I did:

1) First, set up the new yum repository you need. In our case, yum repositories (and GPG keys) are managed by cfengine, so, in our cfengine skel directory for the worker nodes, I added:

wget http://cvmrepo.web.cern.ch/cvmrepo/yum/cernvm.repo -P ./skel/workers/etc/yum.repos.d/
wget http://cvmrepo.web.cern.ch/cvmrepo/yum/RPM-GPG-KEY-CernVM -P ./skel/workers/etc/pki/rpm-gpg/
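
If you're not managing things with cfengine, the manual equivalent on a single node would be roughly this (a sketch; the rpm --import step is my assumption about how you'd register the key by hand):

# fetch the repo definition and key straight onto the node, then teach rpm to trust the key
wget http://cvmrepo.web.cern.ch/cvmrepo/yum/cernvm.repo -O /etc/yum.repos.d/cernvm.repo
wget http://cvmrepo.web.cern.ch/cvmrepo/yum/RPM-GPG-KEY-CernVM -O /etc/pki/rpm-gpg/RPM-GPG-KEY-CernVM
rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-CernVM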


2) Fuse and cvmfs both want to have user and group entries created for them. We manage users and groups with cfengine, so I added a fuse group to /etc/group, plus a cvmfs user and group. The cvmfs user also needs to be added as a member of the fuse group (a manual sketch follows).
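
Done by hand, that would look something like this (a sketch; the nologin shell is my own choice, and cfengine picks the uids/gids for us):

# create the groups, and a cvmfs user that is also a member of the fuse group
groupadd fuse
groupadd cvmfs
useradd -g cvmfs -G fuse -s /sbin/nologin cvmfs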

3) Now that the initial set-up bits are done, the new packages can be installed, again, using cfengine. I added the packages
fuse ; fuse-libs ; cvmfs ; cvmfs-keys ; cvmfs-init-scripts

to the default packages for our worker node class in cfengine.
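
If you were installing by hand rather than via cfengine, the equivalent would simply be:

yum install fuse fuse-libs cvmfs cvmfs-keys cvmfs-init-scripts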

4) Editing configuration files.
You need to edit /etc/auto.master to get autofs to support cvmfs. Just add a line like:

/cvmfs /etc/auto.cvmfs

(the auto.cvmfs map itself is added by the cvmfs rpm). Remember to issue a

service autofs reload

afterwards, or get your configuration management system to do so automagically for you.
You also need to configure fuse to allow users to access mounts as other users, by adding the following line to /etc/fuse.conf:

user_allow_other
And finally, you need to actually configure cvmfs itself. CVMFS uses two main configuration files:
default.local, which overrides the default settings for the local install, and
cern.ch.local, which overrides the default servers to use for *.cern.ch repositories.

/etc/cvmfs/default.local needs to be configured along these lines:


CVMFS_USER=cvmfs
CVMFS_NFILES=32768
#CVMFS_DEBUGLOG=/tmp/cvmfs.log
CVMFS_REPOSITORIES=atlas.cern.ch,atlas-condb.cern.ch,lhcb.cern.ch,cms.cern.ch,geant4.cern.ch,sft.cern.ch
CVMFS_CACHE_BASE=/tmp/cache/cvmfs2/
CVMFS_QUOTA_LIMIT=10000
CVMFS_HTTP_PROXY="nameoflocalsquid1|nameoflocalsquid2"


/etc/cvmfs/cern.ch.local, for UK sites, should probably be configured as:


CVMFS_SERVER_URL="http://cernvmfs.gridpp.rl.ac.uk/opt/@org@;http://cvmfs-stratum-one.cern.ch/opt/@org@"


(since RAL is closer to us than CERN).

A brief note: ';' in a list of options specifies failover, and '|' load-balancing. So "foo;bar" means "try foo, then bar", while "foo|bar;baz" means "try to load-balance queries between foo and bar; if that fails, try baz". This works for the squid proxy specifiers in default.local and also for the server destinations in cern.ch.local.
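
For example, a proxy line in default.local that load-balances across two local squids and then falls back to a direct connection might look like this (the squid names are placeholders, and DIRECT is cvmfs's keyword for "no proxy"):

CVMFS_HTTP_PROXY="http://squid1.example.org:3128|http://squid2.example.org:3128;DIRECT"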

Another note: the cache directory specified in default.local should be large enough to actually cache a useful amount of data on each worker node. 10 GB per VO is reported to be comfortably enough for atlas and lhcb, and is therefore probably wildly exorbitant for any other VO that would be using it. (CVMFS_QUOTA_LIMIT is given in megabytes, so the 10000 in the config above corresponds to roughly a 10 GB cache.) I've tested, and you can happily set this directory to be readable only by the cvmfs user, which gives you a tiny bit more security.
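
Setting that up by hand would look roughly like this (a sketch, assuming the CVMFS_CACHE_BASE from the default.local above; cvmfs creates the per-repository subdirectories underneath it):

# create the cache base and lock it down to the cvmfs user
mkdir -p /tmp/cache/cvmfs2
chown cvmfs:cvmfs /tmp/cache/cvmfs2
chmod 700 /tmp/cache/cvmfs2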

If you change the configuration files for cvmfs, you need to get it to reload them, like autofs.

service cvmfs reload

seems to work fine (and our cfengine config now does this if it has to update those config files).

In our case, I created the two config files, stuck them in the skel directories for worker nodes in cfengine, and added them to the list of files that are expected to be on worker nodes in the config.



5) You can check that all this is working by trying a service cvmfs probe,
or by explicitly mounting a cvmfs path somewhere outside of automount's config.
With the default config, atlas software is at /cvmfs/atlas.cern.ch and so on.
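
For instance, a quick check by hand might look something like this (the mount point is arbitrary; mount -t cvmfs works via the mount helper shipped with the cvmfs rpm):

service cvmfs probe
# or, bypassing autofs entirely:
mkdir -p /mnt/cvmfs-test
mount -t cvmfs atlas.cern.ch /mnt/cvmfs-test
ls /mnt/cvmfs-test
umount /mnt/cvmfs-test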

1 comment:

Unknown said...

Just to be clear here: the stratum-1 service at CERN that you reference is very much on production hardware and well within normal CERN IT operational procedures.

What is not fully in this status is the stratum-0 service, used essentially for doing the software installations in the first place. This will in time come under production service control.