Exercise 8: Direct submission to CREAM CE

Latest revision as of 17:09, 28 August 2012

  • Connect to the user interface
 ssh -p 24 -l gksXYZ gks-011.scc.kit.edu 
  • Check the user certificate
 ls -al .globus
 ls -al .glite
  • Check the available queues and batch system
 ldapsearch -x -H ldap://gks-XYZ.fzk.de:2170 -b mds-vo-name=resource,o=grid
  • Create a user proxy
 voms-proxy-init -voms dech
  • Get the test JDL file: test.jdl
  • Submit the job
 glite-ce-job-submit -d -r <ce_host_FQDN>:8443/cream-pbs-cert -a test.jdl
  • Check the job status
 glite-ce-job-status <cream-id>
  • Get the job output
 glite-ce-job-output <cream-id>
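
The status check usually has to be repeated until the job reaches a final state. The following is a minimal sketch of such a polling loop, not part of the original exercise. It assumes that glite-ce-job-status prints a line of the form "Status = [DONE-OK]" and that DONE-OK, DONE-FAILED, ABORTED and CANCELLED are the final states; the function names and the 30-second default interval are illustrative choices.

```shell
#!/bin/sh
# Extract the state name from glite-ce-job-status output read on stdin.
# Assumption: the client prints a line such as "Status = [DONE-OK]".
parse_status() {
    sed -n 's/.*Status[[:space:]]*=[[:space:]]*\[\([A-Z-]*\)\].*/\1/p' | head -n 1
}

# Succeed if the given state is one of the assumed final CREAM states.
is_final() {
    case "$1" in
        DONE-OK|DONE-FAILED|ABORTED|CANCELLED) return 0 ;;
        *) return 1 ;;
    esac
}

# Poll a job until it reaches a final state.
# $1 = CREAM job id, $2 = seconds between checks (default 30).
wait_for_job() {
    job_id=$1
    interval=${2:-30}
    while :; do
        state=$(glite-ce-job-status "$job_id" | parse_status)
        echo "$job_id: ${state:-unknown}"
        if is_final "$state"; then
            break
        fi
        sleep "$interval"
    done
}
```

After sourcing these functions, wait_for_job <cream-id> followed by glite-ce-job-output <cream-id> would fetch the output once the job has finished. A job that stays in REALLY-RUNNING, as described under Troubleshooting, keeps the loop running, so interrupt it by hand in that case.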

Troubleshooting

Initially, jobs were stuck in the REALLY-RUNNING state. Two fixes were needed.

1. Check the running processes on the worker node

 ps -ef

A row like

     dech021   5591  5586  0 16:12 ?        00:00:00 /bin/sh -l ./CREAM396012076_jobWrapper.sh

will appear in the output. Next, check the job's output and error files in the job directory:

 cd /home/dech021//home_cream_396012076
 ls
     CREAM396012076  CREAM396012076_jobWrapper.sh  cream_396012076.proxy  err_cream_396012076_StandardError  out_cream_396012076_StandardOutput

The error file points to line 80 of CREAM396012076_jobWrapper.sh, and inspecting that line shows that /usr/bin/glite-lb-logevent is missing. To install it:

 yum install glite-lb-client-progs
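
The diagnosis in step 1 can be partly scripted. The sketch below is an illustration only: the function names are hypothetical, and the directory layout (<home>/home_cream_<id> containing out_cream_<id>_StandardOutput and err_cream_<id>_StandardError) is assumed from the listing above.

```shell
#!/bin/sh
# Print the numeric id of every CREAM job wrapper found in "ps -ef"
# output read on stdin, one id per line.
cream_job_numbers() {
    sed -n 's/.*CREAM\([0-9][0-9]*\)_jobWrapper\.sh.*/\1/p'
}

# Show the last lines of a job's stdout and stderr files.
# $1 = home directory of the pool account, $2 = numeric job id.
show_job_logs() {
    job_home=$1
    id=$2
    dir="$job_home/home_cream_$id"
    for f in "$dir/out_cream_${id}_StandardOutput" \
             "$dir/err_cream_${id}_StandardError"; do
        echo "== $f"
        tail -n 20 "$f"
    done
}

# Report whether an executable exists at the given path.
check_tool() {
    if [ -x "$1" ]; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}
```

For example, ps -ef | cream_job_numbers lists the ids of the wrappers currently running, show_job_logs /home/dech021 396012076 prints the tail of both log files, and check_tool /usr/bin/glite-lb-logevent confirms whether the package still needs to be installed.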

2. On the CE, start globus-gridftp-server

 /etc/init.d/globus-gridftp-server start
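
To confirm that the server came up, one option is to look for a listening socket. This sketch assumes the GridFTP server uses its default port 2811 and that netstat -tln prints the local address in the fourth column and the state in the sixth; the function name is hypothetical.

```shell
#!/bin/sh
# Succeed if "netstat -tln" output read on stdin shows a socket
# listening on the given TCP port.
# $1 = port number (2811 is the GridFTP default).
port_listening() {
    awk -v p=":$1" '
        $6 == "LISTEN" && substr($4, length($4) - length(p) + 1) == p { found = 1 }
        END { exit !found }
    '
}
```

On the CE this could be used as netstat -tln | port_listening 2811 && echo "gridftp is listening".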