LCG 2 Tar Distribution



Document identifier:
Date: 22 August 2005
Author: CERN GRID Deployment Group
Version: v2.4.0
Abstract: This document gives additional information on how to install and configure a WN or UI using the tar ball distribution.

Introduction

This document provides some additional information on installing a UI or WN with the relocatable tar distribution.

This distribution can be used on a single node or shared across multiple machines. There are two main ways to share it. The first is to unpack the tar distribution on an NFS server and export it read-only; note that during the configuration phase the mount must be read-write (or the configuration must be carried out on the NFS server itself), and if the directory is shared with a number of nodes, the mount point must be the same on all of them. The second is to re-tar the installation once the configuration steps have been done and then copy and untar the file onto the other nodes. The tar distribution was designed to allow a flexible set-up of the software, so a competent system administrator will probably want to tailor the set-up to the site.
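
For example, a minimal sketch of the re-tar approach (the paths, tar file name and node names below are placeholders):

# Sketch only: repackage a configured installation and copy it to other nodes
cd /opt
tar czf /tmp/lcg-configured.tar.gz LCG
for node in wn001 wn002; do
    scp /tmp/lcg-configured.tar.gz $node:/tmp/
    ssh $node "tar xzf /tmp/lcg-configured.tar.gz -C /opt"
done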

Preparing a node

The following are the steps required to prepare a node for use with the tar distribution. These steps should only be required once. After a node has been prepared, each new release should only require the updating of the tar distribution.

Base Operating System

A working RedHat 7.3 or Scientific Linux 3 (SL3) system, with NTP configured correctly, needs to be installed. It is recommended that you fully patch the system.
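
For example, assuming the standard ntpd service is used, you can check that time synchronisation is working with:

# Check that ntpd is running and synchronised to its servers
/sbin/service ntpd status
/usr/sbin/ntpq -p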

Upgrading From an RPM-Based Installation

If the tar distribution is being deployed on a node that previously had grid middleware components installed, the following steps are required.

WN Batch System Client Installation

If the node is to be a WN, install and configure it for use with your batch system and ensure that it is functioning correctly.

Note that the installation and configuration scripts will not configure any LRMS.

If you want to install the torque client software, you should check the contents of torque-client-rpm.h for your release and install the rpms listed there. You can find the file here:

http://lcgdeploy.cvs.cern.ch/cgi-bin/lcgdeploy.cgi/lcg2/rpmlist_sl3/
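
For example (a sketch only; the rpm names and versions below are placeholders and must be taken from the torque-client-rpm.h file for your release):

# Sketch only: install the torque client rpms listed in torque-client-rpm.h
rpm -ivh torque-1.0.1p6-1.rpm torque-clients-1.0.1p6-1.rpm torque-mom-1.0.1p6-1.rpm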

If you're configuring torque for the first time, you may find the config_users and config_torque_client yaim functions useful. These come with the relocatable distribution and can be invoked like this:

${INSTALL_ROOT}/lcg/yaim/scripts/run_function site-info.def config_users
${INSTALL_ROOT}/lcg/yaim/scripts/run_function site-info.def config_torque_client

Your site-info.def must define the correct CE_HOST or TORQUE_SERVER.
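
site-info.def is sourced as a shell script by yaim, so the relevant lines might look like this (the hostname is a placeholder):

# Excerpt from site-info.def (example values only)
CE_HOST=ce1.example.org
TORQUE_SERVER=$CE_HOST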

Install dependency software

The relocatable distribution has certain dependencies.

We've made this software available as a second tar file which you can download and untar under $EDG_LOCATION. This means that if you untarred the main distribution under /opt/LCG, you must untar the supplementary files under /opt/LCG/edg.
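
For example, assuming the main distribution was untarred under /opt/LCG (the tar file name below is a placeholder for the actual release tar ball):

# Sketch only: untar the dependency software under $EDG_LOCATION
export EDG_LOCATION=/opt/LCG/edg
mkdir -p $EDG_LOCATION
tar xzf lcg-TAR-deps.tar.gz -C $EDG_LOCATION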

Alternatively, if you have apt-get set up, the dependency software can be installed as rpms:

apt-get install lcg-TAR

Please refer to [2] for information on the location of the software and how to configure apt-get.

Install CAs

You need to have the Certification Authority (CA) certificates available. You can either install these directly on each node, if you've set up apt-get:

apt-get install lcg-CA

or, if the middleware is on a network share, put the CAs there so that all nodes can read them.
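
For example, if the middleware share is mounted under /opt/LCG on all nodes, you might keep the CA certificates there and point the grid tools at them (the paths below are examples only):

# Sketch only: copy the CA certificates to the shared area and point
# X509_CERT_DIR at them on every node
mkdir -p /opt/LCG/etc/grid-security
cp -a /etc/grid-security/certificates /opt/LCG/etc/grid-security/
export X509_CERT_DIR=/opt/LCG/etc/grid-security/certificates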

Configuration

The recommended installation method for this distribution is the standard generic installation, described in [2].

In order to understand the steps taken by the generic installation, you are referred to the TAR_UI and TAR_WN sections of the LCG configuration reference [1].

Establishing the WN environment on the CE

If you want to use two different versions of the WN middleware, you can set up two CEs, each of which sets the environment to one of the two instances. You can then submit a job to be run in the environment of your choice by sending it to the appropriate CE (specified in your .jdl file).
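
For example, on an LCG-2 UI the target CE can be given either on the command line or in the JDL (the CE identifier below is a placeholder):

# Sketch only: send a job to a specific CE
edg-job-submit -r ce1.example.org:2119/jobmanager-lcgpbs-long myjob.jdl
# or fix the CE in myjob.jdl itself, e.g.
#   Requirements = other.GlueCEUniqueID == "ce1.example.org:2119/jobmanager-lcgpbs-long";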

You have to patch the CE in order to allow it to establish the WN environment, and make sure there are no grid_env.*sh scripts in /etc/profile.d which would erase this. The two files you need are available from http://lcgdeploy.cvs.cern.ch/cgi-bin/lcgdeploy.cgi/lcg-docs/Tar-Dist-Use/.

Running PBS with two CEs

Below are some notes on how to set up PBS to run with two CEs, required for the scenario described above.

Let's imagine we've got two CEs, CE1 and CE2. CE1 is running all the PBS services and CE2 wants to use them.

Steps (once PBS is properly installed on CE1; a command sketch follows the list):

  1. On CE2, write the name of the machine running PBS in /var/spool/pbs/server_name, i.e. on CE2 the server_name file should contain CE1.domain.com.

  2. Include all of CE2's public keys in the /etc/ssh/ssh_known_hosts file of all the worker nodes (the keys of CE1 have already been added during the PBS installation process).

    1. The public key files are the following (they can be found in /etc/ssh):
              - ssh_host_dsa_key.pub
              - ssh_host_key.pub
              - ssh_host_rsa_key.pub
      

      If you prefer, you can configure SSH to use only one protocol version and key type, in which case you only need to add one of these public keys.

    2. Note that each public key entry must be preceded by the CE2 hostname on the same line.
            Example: CE2.cern.ch XXXXRSA_PUBLIC_KEYxxxxx
                     XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
                     CE2.cern.ch XXXXDSA_PUBLIC_KEYxxxxx
                     XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
                     CE2.cern.ch XXXXHOST_PUBLIC_KEYxxxx
                     XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
      

  3. Repeat step 2 so that both CE1 and CE2 know each other's public keys.

  4. Add in the CE1 /etc/hosts.equiv file a line containing the CE2 hostname (CE2.cern.ch).

  5. The CE1 machine needs to be running the pbs_mom, pbs_server and pbs_sched services in order to allow CE2 to use PBS. No PBS service needs to be running on CE2.
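
A sketch of steps 1, 2 and 4 (hostnames are placeholders; /tmp/CE2-keys is assumed to hold a copy of CE2's /etc/ssh/*.pub files):

# Step 1: on CE2, point PBS at the server running on CE1
echo "CE1.domain.com" > /var/spool/pbs/server_name

# Step 2: on each worker node, append CE2's public host keys to
# /etc/ssh/ssh_known_hosts, each line prefixed with the CE2 hostname
for key in ssh_host_rsa_key.pub ssh_host_dsa_key.pub ssh_host_key.pub; do
    echo "CE2.domain.com $(cat /tmp/CE2-keys/$key)" >> /etc/ssh/ssh_known_hosts
done

# Step 4: on CE1, allow CE2 to copy output back via hosts.equiv
echo "CE2.domain.com" >> /etc/hosts.equiv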

Bibliography

1
G. Diez-Andino, K. Oliver, A. Retico, and A. Usai.
LCG Configuration Reference, 2004.
http://www.cern.ch/grid-deployment/gis/lcg-GCR/index.html.

2
G. Diez-Andino, L. Field, O. Keeble, A. Retico, and A. Usai.
LCG Generic Installation Guide, 2005.
http://grid-deployment.web.cern.ch/grid-deployment/documentation/LCG2-Manual-Install.
