Document identifier: | LCG-GIS-MI |
Date: | 23 November 2006 |
Author: | Guillermo Diez-Andino, Laurence Field, Oliver Keeble, Antonio Retico, Alessandro Usai, Louis Poncet |
Version: | v3.0.0 |
New versions of this document will be distributed synchronously with the
middleware releases and will contain the current
``state-of-the-art'' installation and configuration procedures.
A companion document with the upgrade procedures, describing how to manually update the
configuration of the nodes from the previous LCG/gLite version to the current one, is
also part of the release.
Since the release LCG-2_3_0, the manual installation and configuration
of nodes is supported by a set of scripts.
Nevertheless, the automatic configuration of some particular node types has
intentionally been left uncovered.
The ``supported'' node types are:
For the manual configuration of the specific systems (DPM, LFC, dCache...), you should visit the page http://goc.grid.sinica.edu.tw/gocwiki/SystemsConfigGuides
http://www.scientificlinux.org
The site where the sources and the CD images (iso) can be found is
ftp://ftp.scientificlinux.org/linux/scientific/30x/iso/
Most middleware testing has been carried out on CERN Scientific Linux 3 (SLC3),
but the middleware should run on any binary-compatible distribution.
Use the latest ntp version available for your system. If you are using APT, an apt-get install ntp will do the job.
For each time server you are using, add a couple of lines like the following to the file /etc/ntp.conf (both the hostname and the IP address of each server are required):

restrict <time_server_IP_address> mask 255.255.255.255 nomodify notrap noquery
server <time_server_name>

Additional time servers can be added for better performance results, e.g.:
137.138.16.69 137.138.17.69
If you are using iptables, you can add the following to /etc/sysconfig/iptables

-A INPUT -s <NTP-serverIP-1> -p udp --dport 123 -j ACCEPT
-A INPUT -s <NTP-serverIP-2> -p udp --dport 123 -j ACCEPT
Remember that, in the provided examples, rules are parsed in order, so ensure that there are no matching REJECT lines preceding those that you add. You can then reload the firewall
> /etc/init.d/iptables restart
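The two iptables rules above can also be generated from a list of server addresses. The following is a minimal sketch (the ntp_rules helper is ours, not part of the release); its output can be pasted into /etc/sysconfig/iptables:

```shell
# Print one ACCEPT rule for UDP port 123 (NTP) per server IP given.
ntp_rules() {
    for ip in "$@"; do
        echo "-A INPUT -s $ip -p udp --dport 123 -j ACCEPT"
    done
}

# Example with the CERN time-server addresses quoted earlier:
ntp_rules 137.138.16.69 137.138.17.69
```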
> ntpdate <your ntp server name>
> service ntpd start
> chkconfig ntpd on
> ntpq -p
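ntpq marks the peer currently selected for synchronisation with a leading `*` in its output. A small sketch for checking this programmatically (the ntp_synced helper is our own, assuming the standard ntpq peer-listing format):

```shell
# Succeeds if a peer listing contains a selected ('*') synchronisation source.
# Live usage would be:  ntpq -p | ntp_synced && echo synchronised
ntp_synced() {
    grep -q '^\*'
}

# Canned example with a hypothetical peer line:
printf '*ip-time-1.example.org .PPS. 1 u 12 64 377 1.2 0.1 0.05\n' | ntp_synced \
    && echo "synchronised"
```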
Where yaim is configuring a gLite node type, it populates the XML files and runs the gLite config scripts. Please note that any modifications you make to the XML files, to parameters not managed by yaim, will be preserved. Parameters managed by yaim will be clearly marked in the XML after it has been run. The intention is that yaim offers a simple interface if preferred, but the ability to use the more powerful native mechanism is retained.
Please use yaim to configure pool accounts. Yaim allows non-contiguous ranges of uids which some sites require and is therefore the default user configuration mechanism.
From now on we will refer to the node to be installed as the
target node
In order to work with the yaim installation and configuration tool,
yaim must be installed on the target node.
In order to download yaim:
> wget http://www.cern.ch/grid-deployment/gis/yaim/glite-yaim-x.x.x-x.noarch.rpm
> rpm -ivh glite-yaim-x.x.x-x.noarch.rpm
WARNING: The Site Configuration File is sourced by the configuration
scripts. Therefore there must be no spaces around the equal sign.
Example of wrong configuration:
SITE_NAME = my-site

Example of correct configuration:

SITE_NAME=my-site

A good syntax test for your Site Configuration file (e.g. my-site-info.def) is to try and source it manually, running the command

> source my-site-info.def

and checking that no error messages are produced.
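Sourcing the file in a subshell keeps a broken configuration from polluting your current environment. A sketch of such a check (the check_site_conf helper and the /tmp file paths are ours, for illustration only):

```shell
# Source a site configuration file in a subshell; 'set -e' makes the subshell
# fail on the first error (e.g. "VAR = value" tries to run a command "VAR").
check_site_conf() {
    ( set -e; . "$1" ) >/dev/null 2>&1
}

# Demonstration with a correct and a broken file (illustrative paths):
printf 'SITE_NAME=my-site\n'   > /tmp/good-site-info.def
printf 'SITE_NAME = my-site\n' > /tmp/bad-site-info.def

check_site_conf /tmp/good-site-info.def && echo "good file: OK"
check_site_conf /tmp/bad-site-info.def  || echo "bad file: rejected"
```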
The complete specification of the configurable variables follows. Remember that in the 'examples' directory of yaim there is a commented example site-info.def file which can give further explanation about these variables.
If you need to configure a limited set of nodes, you will not need to supply values for all these variables.
VOs can advertise their parameters via the YAIM VO Configurator;
https://lcg-sft.cern.ch/yaimtool/yaimtool.py
Site administrators can use this utility to maintain a list of the VOs their site supports and to automatically generate the appropriate YAIM fragment.
> wget ftp://ftp.scientificlinux.org/linux/scientific/30x/i386/SL/RPMS/apt-XXX.i386.rpm
> rpm -ivh apt-XXX.i386.rpm
SL3:
LCG_REPOSITORY="rpm http://glitesoft.cern.ch/EGEE/gLite/APT/R3.0/ rhel30 externals Release3.0 updates"
Please note that for the dependencies of the middleware to be met, you'll have to make sure that apt
can find and download your OS rpms. This typically means you'll have to install an rpm called
'apt-sourceslist', or else create an appropriate file in your /etc/apt/sources.list.d directory.
If you are not using SLC3 but another binary-compatible OS distribution, it is highly recommended that you configure apt-get to give priority, during the installation, to the packages listed within your distribution.
In order to have all the known dependencies possibly solved by apt-get you should have at least the following lists in your /etc/apt/sources.list.d/:
Since the deployment team is based at CERN and uses the local installation, it is still possible that, with this bare configuration, some dependencies, though dealt with, cannot be solved, because the binary-compatible distribution you use does not provide the entire set of packages which CERN SL3 does.
If you prefer not to handle these issues manually you could add in the /etc/apt/sources.list.d/ another list (e.g. cern.list)
### List of available apt repositories available from linuxsoft.cern.ch
### suitable for your system.
###
### See http://cern.ch/linux/updates/ for a list of other repositories and mirrors.
### 09.06.2004
###

# THE default
rpm http://linuxsoft.cern.ch cern/slc30X/i386/apt os updates extras
rpm-src http://linuxsoft.cern.ch cern/slc30X/i386/apt os updates extras
Then you have to configure your apt-get preferences in order to give priority to your OS and not to CERN SLC3.
A /etc/apt/preferences file like the following will give priority to your OS in all cases, except when the package you need is not present in your OS repository:
Package: *
Pin: release o=your-os.your-domain.org
Pin-Priority: 980

Package: *
Pin: release o=linux.cern.ch
Pin-Priority: 970
If you are not using apt to install, you can pull the packages directly from SLC3's repository using wget. The address is http://linuxsoft.cern.ch/cern/slc305/i386/apt/.
In order to install the node with the desired middleware packages run the command
> /opt/glite/yaim/scripts/install_node <site-configuration-file> <meta-package> [ <meta-package> ... ]
The complete list of the meta-packages available with this release is
provided in 8.1. (SL3)
For example, in order to install a CE with Torque, after the configuration of the site-info.def file is done, you have to run:
> /opt/glite/yaim/scripts/install_node site-info.def lcg-CE_torque
WARNING: There is a known installation conflict between the 'torque-clients'
rpm and the 'postfix' mail server (Savannah bug #5509).
In order to work around the problem, you can either uninstall postfix or remove
the file
/usr/share/man/man8/qmgr.8.gz from the target node.
The ``bare-middleware'' versions of the WN and CE meta-packages are provided in case you have an existing LRMS;
> /opt/glite/yaim/scripts/install_node site-info.def lcg-CE
You can install multiple node types on one machine
> /opt/glite/yaim/scripts/install_node site-info.def <meta-package> <meta-package> ...
Node Type | meta-package Name | meta-package Description |
gLite WMS and LB | glite-WMSLB | Combined WMS LB node |
glite CE | glite-CE | The gLite Computing Element |
FTS | glite-FTS | gLite File Transfer Server |
FTA | glite-FTA | gLite File Transfer Agent |
BDII | glite-BDII | BDII |
LCG Computing Element (middleware only) | lcg-CE | It does not include any LRMS |
LCG Computing Element (with Torque) | lcg-CE_torque | It includes the 'Torque' LRMS |
LCG File Catalog (mysql) | glite-LFC_mysql | LCG File Catalog |
LCG File Catalog (oracle) | glite-LFC_oracle | LCG File Catalog |
MON-Box | glite-MON | RGMA-based monitoring system collector server |
MON-Box | glite-MON_e2emonit | MON plus e2emonit |
Proxy | glite-PX | Proxy Server |
Resource Broker | lcg-RB | Resource Broker |
Classic Storage Element | glite-SE_classic | Storage Element on local disk |
dCache Storage Element | glite-SE_dcache | Storage Element interfaced to dCache without pnfs dependency |
dCache Storage Element | glite-SE_dcache_gdbm | Storage Element interfaced to dCache with dependency on pnfs (gdbm) |
DPM Storage Element (mysql) | glite-SE_dpm_mysql | Storage Element with SRM interface |
DPM Storage Element (Oracle) | glite-SE_dpm_oracle | Storage Element with SRM interface |
DPM disk | glite-SE_dpm_disk | Disk server for a DPM SE |
Dependencies for the re-locatable distribution | glite-TAR | This package can be used to satisfy the dependencies of the relocatable distro |
User Interface | glite-UI | User Interface |
VO agent box | glite-VOBOX | Agents and Daemons |
Worker Node (middleware only) | glite-WN | It does not include any LRMS |
> apt-get update && apt-get -y install lcg-CA
In order to keep the CA configuration up-to-date on your node we strongly recommend Site Administrators to program a periodic upgrade procedure of the CA on the installed node (e.g. running the above command via a daily cron job).
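One way to set up such a periodic upgrade is a fragment in /etc/cron.d. The schedule and file name below are arbitrary examples, not part of the release; the sketch writes to /tmp so it can be inspected safely, whereas on a real node you would write the fragment to /etc/cron.d as root:

```shell
# Generate a cron.d fragment that refreshes the CA rpms daily at 05:30.
CRON_FILE=${CRON_FILE:-/tmp/lcg-ca-update.cron}
cat > "$CRON_FILE" <<'EOF'
# Daily refresh of the lcg-CA meta-package
30 5 * * * root /usr/bin/apt-get update >/dev/null 2>&1 && /usr/bin/apt-get -y install lcg-CA
EOF
echo "wrote $CRON_FILE"
```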
All nodes except UI, WN and BDII require the host certificate/key files before you start their installation.
Contact your national Certification Authority (CA) to understand how to
obtain a host certificate if you do not have one already.
Instructions on how to obtain a CA list can be found at
http://grid-deployment.web.cern.ch/grid-deployment/lcg2CAlist.html
From the CA list so obtained, you should choose a CA close to you.
Once you have obtained a valid certificate, place the certificate and key files in the directory
/etc/grid-security
The general procedure to configure the middleware packages that have been installed on the node via the procedure described in 8., is to run the command:
> /opt/glite/yaim/scripts/configure_node <site-configuration-file> <node-type> [ <node-type> ... ]

For example, in order to configure the WN with Torque you had installed before, after the configuration of the site-info.def file is done, you have to run:
> /opt/glite/yaim/scripts/configure_node site-info.def WN_torque
The following table gives a reference to all the available
configuration targets.
Node Type | Configuration Target | Node Description |
gLite WMS and LB | WMSLB | Combined WMS LB node |
glite CE | gliteCE | The gLite Computing Element |
FTS | FTS | gLite File Transfer Server |
FTA | FTA | gLite File Transfer Agent |
BDII | BDII | A top level BDII |
Computing Element (middleware only) | CE | It does not configure any LRMS |
Computing Element (with Torque) * | CE_torque | It configures also the 'Torque' LRMS client and server (see 12.6. for details) |
LCG File Catalog server * | LFC_mysql | Set up a mysql based LFC server |
MON-Box | MON | RGMA-based monitoring system collector server |
e2emonit | E2EMONIT | e2emonit monitoring add-on for the MON box |
Proxy | PX | Proxy Server |
Resource Broker | RB | Resource Broker |
Classic Storage Element | SE_classic | Storage Element on local disk |
Disk Pool Manager (mysql) * | SE_dpm_mysql | Storage Element with SRM interface and mysql backend |
Disk Pool Manager disk * | SE_dpm_disk | Disk server for SE_dpm |
dCache Storage Element | SE_dcache | Storage Element interfaced with dCache |
Re-locatable distribution * | TAR_UI or TAR_WN | It can be used to set up a Worker Node or a UI (see 12.9. for details) |
User Interface | UI | User Interface |
VO agent box | VOBOX | Machine to run VO agents |
Worker Node (middleware only) | WN | It does not configure any LRMS |
Worker Node (with Torque client) | WN_torque | It configures also the 'Torque' LRMS client |
You can use yaim to install more than one node type on a single machine. In this case, you should install all the relevant software first, and then run the configure script. For example, to install a combined RB and BDII, you should do the following;
> /opt/glite/yaim/scripts/install_node site-info.def RB BDII
> /opt/glite/yaim/scripts/configure_node site-info.def RB BDII
All node-types must be given as arguments to the same invocation of configure_node - do not run this command once for each node type. Note that combinations known not to work are the CE/RB, RB/SE, CE/BDII.
In this section we list configuration steps actually needed to complete the
configuration of the desired node but not supported by the automatic
configuration scripts.
If a given node does not appear in this section, it means that its
configuration is complete.
install_node site-info.def glite-WMSLB
configure_node site-info.def WMSLB
Normal gLite CE configuration has no site BDII;
install_node site-info.def glite-CE
configure_node site-info.def gliteCE
If you want your gliteCE to run the site BDII;
configure_node site-info.def gliteCE BDII_site
Due to a yaim bug (fixed in later versions) you'll have to add
BDII_site_FUNCTIONS=$BDII_FUNCTIONS
in yaim's scripts/node-info.def (after the BDII_FUNCTIONS definition) for this to work.
BDII_site is just a mechanism to allow a gLiteCE to run a site level BDII, it is not a standard configuration target or node type (there is no associated meta-rpm).
Place a file containing the following in $LCG_LOCATION/var/gip/ldif
dn: GlueSiteUniqueID=<YourSiteName>,mds-vo-name=local,o=grid
objectClass: GlueTop
objectClass: GlueSite
objectClass: GlueKey
objectClass: GlueSchemaVersion
GlueSiteUniqueID: <YourSiteName>
GlueSiteName: <YourSiteName>
GlueSiteDescription: LCG Site
GlueSiteUserSupportContact: mailto: <YourUserSupportContactMail>
GlueSiteSysAdminContact: mailto: <YourSysAdminContactMail>
GlueSiteSecurityContact: mailto: <YourSecurityContactMail>
GlueSiteLocation: <City>, <Country>
GlueSiteLatitude: <SiteLatitude>
GlueSiteLongitude: <SiteLongitude>
GlueSiteWeb: <YourSiteWeb>
GlueSiteSponsor: none
GlueSiteOtherInfo: TIER 2
GlueSiteOtherInfo: my-bigger-site.domain
GlueForeignKey: GlueSiteUniqueID=<YourSiteName>
GlueSchemaVersionMajor: 1
GlueSchemaVersionMinor: 2
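Since every <...> placeholder in the template must be filled by hand, a crude sanity check is to grep for any that remain. This is our own sketch, demonstrated on throwaway files with illustrative content:

```shell
# Fail if angle-bracket placeholders such as <YourSiteName> are still present.
ldif_filled() {
    ! grep -q '<[A-Za-z]' "$1"
}

# Demonstration on stand-in files (illustrative content):
printf 'GlueSiteName: my-site\n'        > /tmp/site-ok.ldif
printf 'GlueSiteName: <YourSiteName>\n' > /tmp/site-bad.ldif

ldif_filled /tmp/site-ok.ldif  && echo "ldif ready"
ldif_filled /tmp/site-bad.ldif || echo "placeholders remain"
```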
The glite-CE configuration also configures the software and scheduler GIP plugins. Due to a bug in the /opt/lcg/libexec/lcg-info-dynamic-scheduler file, the following command must be run in order to get correct functionality:
# sed -i '{s/jobmanager/blah/}' /opt/lcg/libexec/lcg-info-dynamic-scheduler
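To illustrate what the fix does, here is the same substitution applied to a throwaway copy: every occurrence of the string jobmanager is rewritten to blah. The scratch file and its content are illustrative; on the real node the target is /opt/lcg/libexec/lcg-info-dynamic-scheduler, and keeping a backup of it first is prudent.

```shell
# Demonstrate the sed fix on a scratch file instead of the real plugin.
printf 'queue: jobmanager-pbs\n' > /tmp/sched-demo
sed -i '{s/jobmanager/blah/}' /tmp/sched-demo
cat /tmp/sched-demo
```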
If you are installing your batch system server on the same node as the CE, and you want to use yaim or gLite to configure it, please choose one or the other and stick to it. If you use yaim and then make modifications via the gLite system, any rerun of yaim will reset the configuration. The same advice applies to management of WNs. If yaim fulfils your needs, this is the recommended route.
install_node site-info.def glite-CE glite-torque-server-config
configure_node site-info.def gliteCE TORQUE_server
TORQUE_server is a configuration target provided to help configure torque with the gliteCE or on a separate machine. There is no directly associated meta-rpm, but please use glite-torque-server-config to combine with the gliteCE (as illustrated above).
/opt/glite/bin/BLParserPBS -p 33332 -s /var/spool/pbs

BLParserPBS is from glite-ce-blahp.
Note that the log-parser daemon must be started on whichever node is running the batch system. If your CE node is also the batch system head node, you have to run the log-parser here.
If you are running two CEs (typically LCG and gLite versions) please take care to ensure no collisions of pool account mapping. This is typically achieved either by allocating separate pool account ranges to each CE or by allowing them to share a gridmapdir.
Information on running with LSF can be found here https://uimon.cern.ch/twiki/bin/view/LCG/LSFCeExtraSteps
Information on Sun Grid Engine can be found here http://goc.grid.sinica.edu.tw/gocwiki/How_to_run_an_SGE_farm_on_LCG
You can add E2EMONIT to your MON box like this
install_node site-info.def glite-MON_e2emonit
configure_node site-info.def MON E2EMONIT
To install the glite-WN + Torque client
install_node site-info.def glite-WN glite-torque-client-config
configure_node site-info.def WN_torque
WARNING: in the CE configuration context (and also in the 'torque' LRMS one),
a file with a list of managed nodes needs to be compiled. An example of this
configuration file is given in /opt/glite/yaim/examples/wn-list.conf
The variable WN_LIST in the Site Configuration File (see 6.1.) then needs
to point to this file's path.
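A sketch for generating such a file (the hostnames are illustrative; WN_LIST must then be set to the file's path in your site-info.def):

```shell
# Write one fully qualified worker-node name per line, in the same format as
# /opt/glite/yaim/examples/wn-list.conf.
WN_CONF=/tmp/wn-list.conf
for i in 01 02 03; do
    echo "wn${i}.my-site.example.org"
done > "$WN_CONF"
cat "$WN_CONF"
```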
The Maui scheduler configuration provided with the script is currently very
basic.
More advanced configuration examples, to be implemented manually by Site Administrators, can be found in [6].
There is still a manual step required in configuring FTS
https://uimon.cern.ch/twiki/bin/view/LCG/FtsServerInstall15
At the present time, the FTS requires a different proxy server to that used by the broker. Please ensure this restriction is respected in the site-info.def file you use to configure the File Transfer Server.
Please see the FTS install guides for more information
https://uimon.cern.ch/twiki/bin/view/LCG/FtsRelease15
https://uimon.cern.ch/twiki/bin/view/LCG/FtsServerInstall15
Yaim does not yet support dCache with a postgresql-based pnfs. To accommodate sites which have already upgraded to this version of pnfs, we now provide two types of dCache SE.
glite-SE_dcache
This has no dependency on pnfs at all, so upgrades of either type (postgresql or gdbm) should work at the rpm level.
glite-SE_dcache_gdbm
This has a dependency on pnfs (i.e. the gdbm version) and is necessary for a new install. Please note, however, that pnfs_postgresql is the preferred implementation, and migration is non-trivial.
Once you have the middleware directory available, you must edit the site-info.def file as usual, putting the location of the middleware into the variable INSTALL_ROOT.
If you are sharing the distribution to a number of nodes, commonly WNs, then they should all mount the tree at INSTALL_ROOT. You should configure the middleware on one node (remember you'll need to mount with appropriate privileges) and then it should work for all the others if you set up your batch system and the CA certificates in the usual way. If you'd rather have the CAs on your share, the yaim function install_certs_userland may be of interest. You may want to mount your share ro after the configuration has been done.
The middleware in the relocatable distribution has certain dependencies.
We've made this software available as a second tar file which you can download and untar under $INSTALL_ROOT. This means that if you untarred the main distribution under /opt/LCG, you must untar the supplementary files in the same place. Please note that in earlier distributions the dependencies were untarred elsewhere.
If you have administrative access to the nodes, you could alternatively use the TAR dependencies rpm.
> /opt/glite/yaim/scripts/install_node site-info.def glite-TAR
For Debian, here is a list of packages which are required for the tarball to work
perl-modules python2.2 libexpat1 libx11-6 libglib2.0-0 libldap2 libstdc++2.10-glibc2.2 tcl8.3-dev libxml2 termcap-compat libssl0.9.7 tcsh rpm rsync cpp gawk openssl wget
Run the configure_node script, adding the type of node as an argument;
> /opt/glite/yaim/scripts/configure_node site-info.def [ TAR_WN | TAR_UI ]
Note that the script will not configure any LRMS. If you're configuring torque for the first time, you may find the config_users and config_torque_client yaim functions useful. These can be invoked like this
${INSTALL_ROOT}/glite/yaim/scripts/run_function site-info.def config_users
${INSTALL_ROOT}/glite/yaim/scripts/run_function site-info.def config_torque_client
You can find a quick guide to this here [5].
If you don't have root access, you can use the supplementary tarball mentioned above to ensure that the dependencies of the middleware are satisfied. The middleware requires java (see 3.), which you can install in your home directory if it's not already available. Please make sure you set the JAVA_LOCATION variable in your site-info.def. You'll probably want to alter the OUTPUT_STORAGE variable there too, as it's set to /tmp/jobOutput by default and it may be better pointing at your home directory somewhere.
Once the software is all unpacked, you should run
> $INSTALL_ROOT/glite/yaim/scripts/configure_node site-info.def TAR_UI

to configure it.
Finally, you'll have to set up some way of sourcing the environment necessary to run the grid software. A script will be available under $INSTALL_ROOT/etc/profile.d for this purpose. Source grid_env.sh or grid_env.csh depending upon your choice of shell.
Installing a UI this way puts all the CA certificates under $INSTALL_ROOT/etc/grid-security and adds a user cron job to download the crls. However, please note that you'll need to keep the CA certificates up to date yourself. You can do this by running
> /opt/glite/yaim/scripts/run_function site-info.def install_certs_userland
In [3] there is more information on using this form of the distribution. You should check this reference if you'd like to customise the relocatable distribution.
This distribution is used at CERN to make its lxplus system available as a UI. You can take a look at the docs for this too [4].
You can download the latest gliteUI_WN-3.x.y-z.tar.gz and gliteUI_WN-3.x.y-z-userdeps.tar.gz tar files from
http://grid-deployment.web.cern.ch/grid-deployment/download/relocatable/
http://egee-sa1.web.cern.ch/egee-sa1/ROC-support.htm
version | date | description |
v2.5.0-1 | 17/Jul/05 | Removing Rh 7.3 support completely. |
v2.3.0-2 | 10/Jan/05 | 6.1.: CA_WGET variable added in site configuration file. |
v2.3.0-3 | 2/Feb/05 | Bibliography: Link to Generic Configuration Reference changed. |
" | " | 12.6., 6.1.: Details added on WN and users lists. |
" | " | script ``configure_torque''. no more available: removed from the list. |
v2.3.0-4 | 16/Feb/05 | Configure apt to find your OS rpms. |
v2.3.0-5 | 22/Feb/05 | Remove apt prefs stuff, mention multiple nodes on one box. |
v2.3.0-6 | 03/Mar/05 | Better lcg-CA update advice. |
v2.3.1-1 | 03/Mar/05 | LCG-2_3_1 locations |
v2.3.4-0 | 01/Apr/05 | LCG-2_4_4 locations |
v2.3.4-1 | 08/Apr/05 | external variables section inserted |
v2.3.4-2 | 31/May/05 | 4.: fix in firewall configuration |
" | " | 11.: verbatim line fixed |
v2.5.0-0 | 20/Jun/05 | 6.1.: New variables added |
" | " | 11.1.: New nodes added (dpm) |
v2.5.0-1 | 28/Jun/05 | 7.: note on apt-get preferences added |
v2.6.0-1 | 23/Sep/05 | 10.: host certificates needed on the VOBOX |
v2.7.0-0 | 11/Jan/06 | 6.1.: new variables GROUPS_CONF RB_RLS BATCH_LOG_DIR CLASSIC_HOST CLASSIC_STORAGE_DIR DPMPOOL_NODES SE_LIST BDII_FCR_URL VO_XXX_VOMSES added. |
v2.7.0-1 | 10/Feb/06 | 11.1.: all the forbidden combinations added |
v3.0.0-1 | 28/Apr/06 | Updated for gLite 3.0 |
This document was generated using the LaTeX2HTML translator Version 2002-2-1 (1.70)
The translation was initiated by Oliver KEEBLE on 2006-11-23