Friday, October 30, 2009

worker node on demand

Virtualisation is a hot topic again for grid services and worker nodes on demand.

KVM, Xen, VMware - everyone is using a different one.
Virtualisation for the cloud - Nimbus, OpenNebula, Eucalyptus.

The future... ??
1. Plain signed virtual images transported from site to site (a verification sketch follows below).
2. Virtual images including experiment software.
3. Pilot job frameworks connected up, instantiated with virtual images.
4. Pilot frameworks replaced by commercial domain schedulers; virtual clusters.
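
None of the cloud frameworks above mandates a particular signing scheme, but the idea behind step 1 is simple enough to sketch. The snippet below is purely illustrative - the manifest layout, file names and GPG usage are my own assumptions, not part of Nimbus, OpenNebula or Eucalyptus: the receiving site checks the detached GPG signature on a checksum manifest, then verifies the image against it before handing it to the hypervisor.

    # Hypothetical check of a signed VM image before instantiation.
    # Assumes a manifest of "sha256  filename" lines plus a detached GPG
    # signature (manifest.sig) made by the originating site.
    import hashlib
    import subprocess
    import sys

    def verify_image(image_path, manifest_path, sig_path):
        # 1. Verify the manifest signature (signer's key must be in the keyring).
        subprocess.check_call(["gpg", "--verify", sig_path, manifest_path])
        # 2. Look up the expected checksum for this image.
        expected = None
        with open(manifest_path) as manifest:
            for line in manifest:
                checksum, name = line.split()
                if name == image_path.split("/")[-1]:
                    expected = checksum
        if expected is None:
            raise RuntimeError("image not listed in manifest")
        # 3. Hash the image and compare.
        sha = hashlib.sha256()
        with open(image_path, "rb") as image:
            for block in iter(lambda: image.read(1 << 20), b""):
                sha.update(block)
        if sha.hexdigest() != expected:
            raise RuntimeError("checksum mismatch - refusing to use this image")

    if __name__ == "__main__":
        verify_image(sys.argv[1], sys.argv[2], sys.argv[3])
        print("image verified")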

Monday, October 26, 2009

HEPIX is GO!

HEPIX Workshop

Site Reports Session

CERN:
Getting serious about ITIL. Solaris being phased out. Getting serious about 10GigE.
Lustre pilot project. New purchases discussed.

JLAB:
New LQCD Cluster: "2009 Quad Infiniband - ARRA Cluster"
Storage - 14 whitebox AMAX servers, Solaris w/ZFS or Lustre
Compute - Dell PowerEdge R410, 2 x quad-core, QDR InfiniBand, 24 GB RAM

Auger Cluster Upgraded
Nehalems - Intel X5530 dual CPU, quad core, 24 GB RAM, 500 GB SATA
(seeing I/O contention on disk when running 14/16 jobs)
OS switch from Fedora 8 32-bit to CentOS 5.3 64-bit

No real Grid Computing
IBM TS3500 tape library installed. StorageTek Powderhorn silos replaced.
80 production VMs on VMware ESX 3.5, with a planned move to vSphere 4.0.

GSI:
FAIR - new accelerator discussion. The futuristic talk!
The Cube datacentre building: 1000 19" water-cooled racks held in a 26x26x26 cube building. Lifts to reach the machines. An iron structure for the racks to sit on.

CC-IN2P3 LYON:
Tier-1 for the 4 LHC experiments plus D0 and BaBar. SL5 migration in Q2 2010 for both the main cluster and the MPI cluster. New purchases and a new server building.

STORAGE Session

Your File System - next-gen OpenAFS (Jeffrey Altman):
YFS is now funded by the US government to create a next-generation OpenAFS, with two years of funding. Deliverables include an assessment of current AFS and a two-year upgrade plan for both client and server as the YFS deliverable. Still open source.

StoRM and Lustre:
IOzone discussion, HammerCloud tests discussion, benchmarking summary. Good results, though performance came in below the IOzone tests; WMS jobs and Panda jobs behave differently. The file:// protocol performs well but requires the VO to support it. Open questions: Lustre striping (yes or no?), performance (RAID config?), monitoring (still work to be done), support (kernel upgrades can take a while to become available) and benchmarks (are they realistic?). Tuning still to do.
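
For context on what the IOzone-style numbers measure, a sequential-write throughput figure can be approximated in a few lines of Python. This is only a rough sketch of the idea - the file path, block size and file size below are made up, not the configuration used in the talk.

    # Rough sequential-write throughput check, IOzone-style (illustrative only).
    import os
    import time

    PATH = "/lustre/test/iotest.dat"   # hypothetical test file on the Lustre mount
    BLOCK = b"x" * (1 << 20)           # 1 MiB blocks
    BLOCKS = 1024                      # 1 GiB total

    start = time.time()
    with open(PATH, "wb") as f:
        for _ in range(BLOCKS):
            f.write(BLOCK)
        f.flush()
        os.fsync(f.fileno())           # make sure the data really hit the servers
    elapsed = time.time() - start

    print("wrote %d MiB in %.1f s -> %.1f MiB/s" % (BLOCKS, elapsed, BLOCKS / elapsed))
    os.remove(PATH)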

Lustre at GSI:
Users - ALICE analysis for the Tier-2, GSI experiments, FAIR simulations. Still on 1.6.7.2. 1 PByte, >3000 nodes. Foundry RX32 Ethernet switch. MDS HA pair, one standby. 84 OSS, 200 OSTs. MDS: 8-core 3 GHz Xeon, 32 GB RAM. Real throughput testing with the ALICE analysis train: 50 Gbit/s using 2000 cores. Hardware and software issues - a complex system, and vulnerable to network communications. Using the Robinhood filesystem monitor for audit and management; this protects the MDS by directing requests (e.g. top ten users, file moves etc.) to a MySQL instance instead. Using this rather than e2scan.
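
The appeal of the Robinhood approach is that reporting queries hit the MySQL database rather than the MDS. A toy "top ten users" report might look like the sketch below; the table and column names are invented for illustration and are not Robinhood's actual schema.

    # Illustrative "top ten users by file count" query against a Robinhood-style
    # database. Table and column names (entries, owner, size) are hypothetical.
    import MySQLdb

    conn = MySQLdb.connect(host="rbh-db.example.org", user="report",
                           passwd="secret", db="robinhood")
    cur = conn.cursor()
    cur.execute("""
        SELECT owner, COUNT(*) AS nfiles, SUM(size) AS bytes
        FROM entries
        GROUP BY owner
        ORDER BY nfiles DESC
        LIMIT 10
    """)
    for owner, nfiles, nbytes in cur.fetchall():
        print("%-12s %10d files %15d bytes" % (owner, nfiles, nbytes))
    conn.close()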

Hadoop on your worker nodes using local hard drives & FUSE:
Hadoop compared against Lustre. Performed well when 8 jobs ran. Replication of files provides redundancy. Cost and maintenance factors are very favourable for small sites. Deployed at some sites in the US. Not really a Tier-1 deployable solution. Name node redundancy exists (will lose at most one transaction) but requires additional software.
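
The attraction for worker nodes is that once HDFS is FUSE-mounted a job just does ordinary POSIX I/O against the mount point. Purely as an illustration (the mount point and file name below are made up):

    # A job reading its input through a FUSE-mounted HDFS path exactly as if
    # it were a local file. Mount point and file name are hypothetical.
    MOUNT = "/mnt/hadoop"

    with open(MOUNT + "/user/alice/run1234/events.dat", "rb") as f:
        header = f.read(1024)   # plain POSIX read; FUSE forwards it to HDFS
    print("read %d bytes of header" % len(header))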

Virtualization Session

lxcloud at CERN:
CERN has developed a proof of concept for virtualised worker nodes: 'golden nodes' serving images to the Xen hypervisors using OpenNebula. Also looked at Platform's VMO. A production lxcloud is being built: 10 machines, 24 GB RAM, 2 TB disk, dual Nehalem. Starting with Xen. Production release by March 2010. Memory is an issue, as the hypervisor itself requires some memory, i.e. with 16 GB RAM you cannot run 8 x 2 GB VMs.
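
The memory point is just arithmetic, but it bites when sizing hosts. A back-of-the-envelope check (the 2 GB hypervisor reservation here is my own assumption for illustration, not CERN's figure):

    # How many fixed-size VMs fit on a host once the hypervisor's own memory
    # reservation is taken into account. The overhead figure is an assumption.
    def vms_per_host(host_ram_gb, vm_ram_gb, hypervisor_overhead_gb=2.0):
        return int((host_ram_gb - hypervisor_overhead_gb) // vm_ram_gb)

    print(vms_per_host(16, 2))   # 7, not 8 - the effect described above
    print(vms_per_host(24, 2))   # 11 on a 24 GB lxcloud node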

Fermigrid:
Has moved much of its infrastructure to the Xen hypervisor. Looks like a solid infrastructure. Investigating KVM, with the possibility of a move in the next few years if it proves to be better. See INFN's Xen vs KVM talk at HEPiX Spring 2009 for a discussion of the differences.

Monday, October 19, 2009

Another new VO at Glasgow

Today I finally got time to create a new VO for our new users in Solid State Physics.
vo.ssp.ac.uk
This is now active across the cluster and users can sign up to the VO from our VOMS server on svr029. It will be used to host CASTEP users and other departmental SSP users.

Our local wiki page on running CASTEP at Glasgow. Only the MPI version to get working now.

Monday, October 12, 2009

CASTEP, A Test of True Grid

Along came another user with a requirement for MPI. Can we run it? Well, yes you can, but remember our interconnects are just plain old Ethernet, nothing fancy like Myrinet or InfiniBand. We are not an HPC cluster but an HTC cluster.

So we have been building CASTEP, an f90 code heavy on MPI scatter/gather - a test of true grid for any HTC cluster. First off, CASTEP requires at least make 3.81 and gfortran 4.3; handy that we moved to SL5, as these are now the standard. Coupled with making sure that the required libs (fftw3, BLAS and LAPACK) are all built with the same compiler, gfortran 4.3, this allowed the single-core version to be built and installed onto the grid.

An MPI version is turning out to be a bit more work. First off, the old, outdated and no-longer-developed MPICH libs have not been built with f90 support enabled by default, so we got hold of the source to recompile with f90 support for gfortran 4.3. There also appeared to be a bug in the gfortran support, so we had to patch the src rpm with a fix we located online. This finally allowed us to build the MPICH lib, which has been tested by compiling an MPI job in C and one in f90, both of which run successfully.
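
Our actual smoke tests were the C and f90 programs mentioned above, but for a quick feel of the scatter/gather pattern CASTEP leans on, an equivalent toy in Python with mpi4py (my choice for illustration here - it is not part of our CASTEP toolchain) looks like this:

    # Toy MPI scatter/gather; run with e.g.: mpirun -np 4 python scatter_gather.py
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    # Rank 0 prepares one chunk of work per rank.
    chunks = [list(range(r * 4, r * 4 + 4)) for r in range(size)] if rank == 0 else None

    # Scatter: each rank receives its own chunk.
    mine = comm.scatter(chunks, root=0)

    # Local computation on the chunk.
    partial = sum(x * x for x in mine)

    # Gather: rank 0 collects the partial results.
    results = comm.gather(partial, root=0)
    if rank == 0:
        print("partial sums:", results, "total:", sum(results))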

Unfortunately CASTEP still doesn't run using it, so more digging is required.