Tracing logs in the RB and the CE.

A job in LCG is identified by its EDG jobid (or simply jobid). The command edg-job-get-logging-info -v 1 <jobid> retrieves some logging information about the job, but in some cases it is necessary to investigate in more depth what happened to the job within the CE. To do that, the CE's site admin must be given an identifier for the job within the CE, so that she can tell which job we are referring to.

This identifier is called the JM-contact string. It is not returned by the edg-job-get-logging-info command, but it can be obtained from the RB log files. The procedure is as follows:

  1. In the logging information, look for the RB's logging file.
    Something like /var/edgwl/logmonitor/CondorG.log/CondorG.xxx.log (search for "CondorG.log")
  2. Retrieve that file from the RB:
    globus-url-copy  gsiftp://<hname>/var/edgwl/logmonitor/CondorG.log/CondorG.xxx.log file:`pwd`/RB_log
  3. In the retrieved file, look for the last part of the jobid.
    For https://lxn1182.cern.ch:9000/HAZrszS8RzlM82MlAqrAIw, it'd be: HAZrszS8RzlM82MlAqrAIw
  4. Get the string of the form xxxxx.0000.0000 associated with that part of the jobid.
  5. In the same retrieved file, search repeatedly for that string until the JM-contact string (labelled as such) is found.
  6. When telling the site about the job, refer to both the EDG jobid and the JM-contact string.
All this can be done automatically with a Perl script, getJM.pl, which includes a description of its arguments and functionality.
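
For readers without the script at hand, steps 3 to 5 can be roughly sketched with standard tools. This is only a sketch, not the contents of getJM.pl. The function names are made up for illustration. It assumes, without confirmation from the RB documentation, that the JM-contact label appears on the same line as the xxxxx.0000.0000 string or within a few lines after it, and that the label text contains "JM-contact" (case is ignored). Step 4 is left to the reader: the first helper only prints the lines around the jobid so that the string can be read off.

```shell
# Step 3: last part of the jobid (everything after the final "/").
jobid_tail() {
    printf '%s\n' "${1##*/}"
}

# Step 4 (manual): show numbered lines around each mention of the jobid,
# so that the xxxxx.0000.0000 string can be read off by eye.
# usage: show_job_context <logfile> <jobid> [context-lines]
show_job_context() {
    grep -n -F -C "${3:-3}" -- "$(jobid_tail "$2")" "$1"
}

# Step 5: look for the label near each mention of the string.
# ASSUMPTIONS: the label is "JM-contact" (override with the third argument)
# and sits at most a few lines after the string (fourth argument).
# usage: jm_contact_lines <logfile> <string> [label] [lines-after]
jm_contact_lines() {
    grep -F -A "${4:-5}" -- "$2" "$1" | grep -i -F -- "${3:-JM-contact}"
}
```

With the file retrieved in step 2, one would run show_job_context RB_log <jobid>, note the xxxxx.0000.0000 string, and then run jm_contact_lines RB_log <string>. If nothing is printed, the assumptions above do not hold for that log, and the manual search of step 5 is the fallback.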


Getting the BDII configuration of the RB.

In LCG2, the information about resources comes from the BDII. In the UI and in the WNs where the jobs run, the BDII is defined by the environment variable LCG_GFAL_INFOSYS (it will not necessarily have the same value in both, unless we set it in the job's JDL).
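
To compare the two values, something like the following could be run both on the UI and as part of the job's script. It is a minimal sketch; show_infosys is a made-up helper name, and only the variable name comes from the text above.

```shell
# Print the BDII setting seen by this environment, or a note if it is unset.
show_infosys() {
    if [ -n "${LCG_GFAL_INFOSYS:-}" ]; then
        printf 'LCG_GFAL_INFOSYS=%s\n' "$LCG_GFAL_INFOSYS"
    else
        printf 'LCG_GFAL_INFOSYS is not set\n'
    fi
}

show_infosys
```

If the line printed on the UI differs from the one in the job's output, the job and the UI are looking at different BDIIs.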

The Resource Broker, however, may use yet another BDII, and that one determines which CEs are seen during matchmaking. The following recipe can be used to find out which BDII our RB uses.

  1. To find out which RB is being used by default in a UI, look at: $EDG_WL_LOCATION/etc/<VO>/edg_wl_ui.conf
    Take the hostname that appears as the value of the NSAddresses attribute.
  2. Do a globus-url-copy of the following configuration file: /opt/edg/etc/edg_wl.conf
    globus-url-copy  gsiftp://<hname>/opt/edg/etc/edg_wl.conf  file:`pwd`/RB_conf
  3. In the retrieved file, look for the value of the II_Contact parameter.
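
Steps 1 and 3 can be sketched with two small helpers. Their names are made up, and the file layouts are assumptions: NSAddresses is taken to hold host:port entries on a single line, and II_Contact to be a double-quoted value on its own line. If the files look different, the patterns need adjusting.

```shell
# Step 1: hostname(s) listed in NSAddresses, with the port stripped.
# ASSUMPTION: entries have the form host:port and sit on the NSAddresses line.
# usage: ns_hosts <edg_wl_ui.conf>
ns_hosts() {
    grep -E '^[[:space:]]*NSAddresses[[:space:]]*=' "$1" \
        | grep -E -o '[A-Za-z0-9.-]+:[0-9]+' \
        | sed 's/:[0-9]*$//'
}

# Step 3: value of II_Contact in the retrieved copy of edg_wl.conf.
# ASSUMPTION: the value is enclosed in double quotes.
# usage: ii_contact <RB_conf>
ii_contact() {
    awk -F'"' '/^[ \t]*II_Contact[ \t]*=/ { print $2 }' "$1"
}
```

For example, ns_hosts $EDG_WL_LOCATION/etc/<VO>/edg_wl_ui.conf gives the <hname> to use in step 2, and ii_contact RB_conf prints the BDII host once the file has been copied.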

Debugging authentication errors.

First, let us review the main issues to take care of to avoid authentication problems:

And when an authentication error occurs...