Date: 31 January 2006
Author: CERN GRID Deployment Group (<email@example.com>)
Please see the following page for details of the current release: http://lcg.web.cern.ch/LCG/Sites/releases.html
These notes describe the process of setting up and registering a grid site using the middleware packaged by LCG. This middleware represents the current middleware stack used in the LCG-2 and EGEE production grids. The information is relevant for site managers or system administrators who want to set up an EGEE/LCG-2 production site or upgrade their site to the latest release.
This is best answered by the material on the project's web site, http://lcg.web.cern.ch/LCG/ . There you can find information about the nature of the project and its goals. At the end of the introduction is a section that collects most of the references.
This release is available for Scientific Linux 3 (SL3) and compatible distributions.
If you want to join and add resources, you should contact the LCG deployment manager, Ian Bird (Ian.Bird@cern.ch), to establish contact with the project.
Support for sites is organized hierarchically. Please contact the managers of the regional operations centre (ROC) for your region. If your site is not covered by the following list, contact Ian Bird; your site will then either be connected to one of the existing ROCs or the CERN deployment team will provide the required support.
The ROC managers' mailing list (firstname.lastname@example.org) and the EGEE project web page, especially its SA1 section, can help you find a matching ROC.
The formal process for becoming a site in LCG2/EGEE is currently being adapted to the new structure introduced by the EGEE project. Until this is finalized, the following steps should be followed. Please copy the deployment team (email@example.com) on your mail exchanges with your ROC. After the initial step described above, proceed as follows:
In addition, there are further steps to go through after you have started the setup of your site:
You should first register as a user and subscribe to the LCG Rollout mailing list (http://www.listserv.rl.ac.uk/archives/lcg-rollout.html). New releases are announced on this list, and it is the common place to exchange information.
It is quite useful to have a look at the user guide: https://edms.cern.ch/file/454439//LCG-2-Userguide.pdf . In addition, you need to contact the Grid Operation Centre (GOC) (http://goc.grid-support.ac.uk/gridsite/gocmain/ ) and get access to the GOC-DB to register your resources with them. This registration is the basis for your site's presence in their monitoring. It is mandatory to register at least your service nodes in the GOC DB; it is not necessary to register all farm nodes.
Discuss with your ROC or the grid deployment team a suitable layout for your site. Various configurations are possible. Experience has shown that it is highly advisable to begin with a small standardized setup and evolve from it to a larger, more complex system. The typical layout for a minimal site is a user interface (UI) node, which allows users to submit jobs to the grid. This node will use the information system and resource broker either from the ROC or CIC site, or from the CERN site. A site that can provide resources will add a computing element (CE), which acts as a gateway to the computing resources, and a storage element (SE), which acts as a gateway to the local storage. In addition, a few worker nodes (WN) can be added to provide the computing power. Smaller sites will most likely add the RGMA monitoring node functionality to their SE, while medium to large sites should add a separate node as the MON node.
Large sites with many users submitting large numbers of jobs will add a resource broker (RB). The resource broker distributes jobs to the sites that are available to run them and keeps track of their status. For resource discovery, the RB uses an information index (BDII). It is good practice to set up a BDII at each site that operates an RB. A complete site will also add a Proxy server node, which allows the renewal of proxy certificates.
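As an illustration, a minimal site of this shape is usually described to the configuration tools through a small set of host variables in a single file. The sketch below uses made-up host names, and the exact variable names depend on the configuration tools of your release (check the installation guide), so treat it only as an orientation:

```shell
# Illustrative sketch of a minimal site description; host names are
# hypothetical and variable names must be checked against your release.
SITE_NAME=my-test-site
CE_HOST=ce.example.org          # computing element: gateway to the batch system
SE_HOST=se.example.org          # storage element: gateway to local storage
MON_HOST=$SE_HOST               # small site: RGMA monitoring co-hosted on the SE
BDII_HOST=bdii.roc.example.org  # information index, e.g. provided by your ROC
RB_HOST=rb.roc.example.org      # resource broker, e.g. provided by your ROC
WN_LIST=/opt/lcg/wn-list.conf   # file listing the worker nodes
```

A larger site would point BDII_HOST and RB_HOST at its own nodes instead of the ROC's.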
If you don't find a setup described in this installation guide that meets your needs, contact your ROC for further help. Another place to look for alternative configurations is the administration FAQ at http://goc.grid.sinica.edu.tw/gocwiki/FrontPage .
The process for adding additional VOs is described in the installation guides. The steps involved in adding a new VO are described on this web page: http://grid-deployment.web.cern.ch/grid-deployment/cgi-bin/index.cgi?var=gis/vo-deploy . In addition, sites that support additional VOs have to add these VOs to their configuration files. The procedure to set up a file catalogue service for a new VO is described on the gocwiki page http://goc.grid.sinica.edu.tw/gocwiki/FrontPage .
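As a hedged illustration, supporting an additional VO usually amounts to adding a few VO-specific entries to the site configuration. The VO name "myvo" below is invented, and the VO_<NAME>_* variable names shown follow a common pattern that you should verify against the installation guide for your release:

```shell
# Illustrative only: add a hypothetical VO "myvo" to the supported list
# and give it the usual per-VO settings (verify the exact variable names
# against your release's installation guide before use).
VOS="atlas cms myvo"
VO_MYVO_SW_DIR=/opt/exp_soft/myvo   # experiment software area for the VO
VO_MYVO_DEFAULT_SE=se.example.org   # default storage element for the VO
```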
If you want to use the grid as a user, you are currently reading the wrong document. Please go to the EGEE NA4 page and get into contact with a VO. To learn more about using LCG you can follow the steps described in the LCG User Overview (http://lcg.web.cern.ch/LCG/peb/grid_deployment/user_intro.htm ). The registration and initial training using the LCG-2 User's Guide (https://edms.cern.ch/file/454439//LCG-2-Userguide.pdf ) should take about a week. However, only about 8 hours of that is spent working with the system; the majority is waiting for the registration process with the VOs and the CA.
On the LCG user introduction page (http://lcg.web.cern.ch/LCG/peb/grid_deployment/user_intro.htm ) you can find information on the currently appropriate way to report problems. Always report problems first to your ROC. Many problems are currently reported to the rollout list. Internally we still use a Savannah-based bug tracking tool, which can be accessed via https://savannah.cern.ch/bugs/?group=lcgoperation .
The current software requires outgoing network access from all nodes, and incoming access to the RB, CE, SE, and MyProxy server.
Some sites have gained experience with running their sites through a NAT and using dual network interfaces on the service nodes. The ROC in Italy has compiled some information about this. Please contact them for details.
To configure your firewall you should use the port table that we provide as a reference. Please have a look at the chapter on firewall configuration.
Although we provide kernel RPMs in our repositories and reference certain versions in the configuration, it must be pointed out that it is your responsibility to make sure the kernel you install is one you consider safe. If the provided default is not what you want, please replace it.
We expect site managers to be aware of the relevant security-related policies of LCG. A page that summarises this information has been prepared and can be accessed at http://proj-lcg-security.web.cern.ch/proj-lcg-security/sites/for_sites.htm .
[D0] EGEE Project Homepage
[D1] LCG Project Homepage
[D2] Starting point for users of the LCG infrastructure
[D3] LCG-2 User's Guide
[D6] LCG GOC Mainpage
If your LCG nodes are behind a firewall, you will have to ask your network manager to open a few "holes" to allow external access to some LCG service nodes.
A complete map of which ports have to be accessible for each service node is provided in the file lcg-port-table.pdf in the lcg2/docs directory: http://lcgdeploy.cvs.cern.ch/cgi-bin/lcgdeploy.cgi/lcg2/docs/lcg-port-table.pdf .
If possible, do not allow ssh access to your nodes from outside your site.
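As a sketch only, the kind of rules your network manager ends up adding looks like the following. The ports shown are a commonly cited subset (Globus gatekeeper, GridFTP, BDII, MyProxy); the authoritative list is the lcg-port-table.pdf referenced above, so verify every port against it for your release:

```shell
# Print iptables commands opening an illustrative subset of LCG service
# ports. Verify each port against lcg-port-table.pdf before applying.
open_lcg_ports() {
  for entry in gatekeeper:2119 gridftp:2811 bdii:2170 myproxy:7512; do
    service=${entry%%:*}
    port=${entry#*:}
    echo "iptables -A INPUT -p tcp --dport ${port} -j ACCEPT  # ${service}"
  done
}
open_lcg_ports
```

Printing the commands rather than executing them lets you review the rule set before handing it to the firewall.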
Your queue lengths are currently adjusted to CPU time and wall-clock time. To allow users to properly match their jobs to local resources, some care has to be taken to configure, in the information system of the CE, the parameters that describe the speed of your nodes.
The whole issue is a bit complicated, so we have put together the following as a guideline for selecting the right values. Since we can't set both values, SpecFloat and SpecInt, correctly, we suggest setting SpecFloat to 0.
The SpecInt value can be taken either from http://www.specbench.org/osg/cpu2000/results/cint2000.html , or from this short list:
CPU          SI2K
P4 2.4 GHz    852
P3 1.0 GHz    461
P3 0.8 GHz    340
P3 0.6 GHz    270

Please note that some of the HEP experiments run very long jobs. If you support them, your longest queue should be able to handle 48-hour jobs on a node corresponding to a 1 GHz PIV.
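The 48-hour requirement can be translated to your own hardware by scaling with the SI2K ratings above. This is a back-of-the-envelope sketch: 461 is assumed here as the reference rating for the ~1 GHz reference node, and you should substitute the rating of the reference node you actually use:

```shell
# Scale the 48 h reference CPU-time limit to a local node using the
# SI2K ratings (integer arithmetic, rounded up).
ref_si2k=461     # assumed rating of the ~1 GHz reference node
local_si2k=852   # e.g. a P4 2.4 GHz worker node from the table above
ref_hours=48
local_hours=$(( (ref_hours * ref_si2k + local_si2k - 1) / local_si2k ))
echo "A ${local_si2k} SI2K node needs a CPU-time limit of at least ${local_hours} hours"
```

Faster nodes need a proportionally shorter limit; slower nodes need a longer one.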
This has been provided by David Kant <D.Kant@rl.ac.uk>
The GOC will be responsible for monitoring the grid services deployed through the LCG middleware at your site.
Information about the site is managed by the local site administrator. The information we require is the site contact details, the list of nodes and IP addresses, and the middleware deployed on those machines (EDG, LCG1, LCG2, etc.).
Access to the database is done through a web browser (https) using an X.509 certificate issued by a trusted LCG CA.
GOC monitoring is done hourly and begins with an SQL query of the database to extract your site details. Therefore, it is important to ensure that the information in the database is ACCURATE and UP-TO-DATE.
To request access to the database, load your certificate into your browser and go to:
The GOC team will then create a customised page for your site and give you access rights to these pages. This process should take less than a day and you will receive an email confirmation. Finally, you can enter your site details:
The GOC monitoring pages display current status information about LCG2:
This document was generated using the LaTeX2HTML translator Version 2002 (1.62)
The translation was initiated by Oliver KEEBLE on 2006-01-31