Configuration of the CREAM CE

site-info.def

Get your site-info.def from your site BDII, or start from the example file /opt/glite/yaim/examples/siteinfo/site-info.def. Save it wherever you like, e.g. in /root/yaim/ or /opt/glite/yaim/, and make sure that it is readable only by root.

mkdir /root/yaim
cd /root/yaim
scp gks-X-XYZ:/root/yaim/site-info.def /root/yaim
chmod 600 /root/yaim/site-info.def
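
The requirement that site-info.def is readable only by root can be verified with a small helper. This is only a sketch: the function name is_private_to_owner is made up for illustration, and it assumes GNU stat as found on Scientific Linux.

```shell
# Hypothetical helper: succeeds only if the given regular file has mode 600
# (read/write for the owner, nothing for group and others).
is_private_to_owner() {
    [ -f "$1" ] || return 1
    # GNU stat: %a prints the permission bits in octal.
    [ "$(stat -c '%a' "$1")" = "600" ]
}

# Example (run as root on the CE):
#   is_private_to_owner /root/yaim/site-info.def || chmod 600 /root/yaim/site-info.def
```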

users.conf

For every supported virtual organization you need to define users in the users.conf file. For details please go to users.conf.

groups.conf

For every group of a virtual organization you are supporting you need to define it in the groups.conf file. For details please refer to groups.conf.

wn-list.conf

For the TORQUE and the CREAM Compute Element you need a file where all your worker nodes are listed. By default the file should be stored at /opt/glite/yaim/etc/wn-list.conf. In this course you have only one host available for a worker node, so create a file /opt/glite/yaim/etc/wn-list.conf and add its full hostname.
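
As a sketch, the worker node list can be written with a helper that stores one full hostname per line and rejects short names. The function name write_wn_list is an assumption for illustration, and gks-X-XYZ.fzk.de in the example stands for your own worker node.

```shell
# Hypothetical helper: write one worker node hostname per line.
# Usage: write_wn_list <target-file> <fqdn> [<fqdn> ...]
write_wn_list() {
    target="$1"
    shift
    [ "$#" -ge 1 ] || { echo "no hostnames given" >&2; return 1; }
    # Refuse short names: every entry must contain at least one dot.
    for host in "$@"; do
        case "$host" in
            *.*) ;;
            *) echo "not a full hostname: $host" >&2; return 1 ;;
        esac
    done
    printf '%s\n' "$@" > "$target"
}

# Example for this course:
#   write_wn_list /opt/glite/yaim/etc/wn-list.conf gks-X-XYZ.fzk.de
```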

hostkeys

As for most grid services you need a host certificate for the CREAM Compute Element with the right permissions. For more details please refer to Host certificates.

Variables

Before the configuration script can be run, site-info.def has to be adapted. A good overview of the available parameters can be found in the LCG Wiki at CERN. The required variables are:

SITE_NAME=GKSXYZ
CE_HOST="gks-X-XYZ.fzk.de"
BATCH_SERVER="gks-X-XYZ.fzk.de"

USERS_CONF=/opt/glite/yaim/etc/users.conf
GROUPS_CONF=/opt/glite/yaim/etc/groups.conf
WN_LIST=/opt/glite/yaim/etc/wn-list.conf

JAVA_LOCATION=/usr/java/default

JOB_MANAGER=pbs
CE_BATCH_SYS=pbs
BATCH_VERSION=torque-2.3.6-2
BATCH_BIN_DIR=/usr/bin
BATCH_LOG_DIR=/var/spool/pbs/
BATCH_SPOOL_DIR=/var/spool/pbs/
CREAM_CE_STATE="Special"
BLPARSER_HOST=$CE_HOST
BLAH_JOBID_PREFIX="cream_"
BLPARSER_WITH_UPDATER_NOTIFIER="false"
MYSQL_PASSWORD=koala
APEL_DB_PASSWORD="APELDB_PWD" 

# MON_HOST=mon.example.org  # use this only if you still run a (deprecated) mon host
APEL_MYSQL_HOST="gks-X-XYZ"  # point to APEL host

CE_OS_ARCH=x86_64           # "uname -i" executed on the WN
CE_OS=ScientificSL          # "lsb_release -i | cut -f2" executed on the WN
CE_OS_RELEASE=5.4           # "lsb_release -r | cut -f2" executed on the WN
CE_OS_VERSION=Boron         # "lsb_release -c | cut -f2" executed on the WN

CE_CPU_MODEL=Opteron        # "grep 'model name' /proc/cpuinfo" executed on the WN - just take the main description (e.g. Opteron or Xeon)
CE_CPU_VENDOR=AMD           # "grep vendor /proc/cpuinfo" executed on the WN - just take the company name (e.g. AMD or Intel)
CE_CPU_SPEED=1800           # "grep MHz /proc/cpuinfo" executed on the WN
CE_MINPHYSMEM=1024          # "grep MemTotal /proc/meminfo" executed on the WN - attention: value is shown in kB, CE_MINPHYSMEM must be in MB!
CE_MINVIRTMEM=2048          # "grep SwapTotal /proc/meminfo" executed on the WN - attention: value is shown in kB, CE_MINVIRTMEM must be in MB!
CE_OUTBOUNDIP=TRUE          # do the WNs have permission for direct outbound connectivity?
CE_INBOUNDIP=FALSE          # do the WNs have permission for inbound connectivity?
CE_RUNTIMEENV="             # list of supported middleware/software
   GLITE-3_2_0
   R-GMA
 "
CE_CAPABILITY="none"        # if VO fairshares are defined in the batch system, they need to be reflected here, otherwise this variable must be none
CE_OTHERDESCR="Cores=4"     # number of cores per CPU for a typical WN ("grep -c "physical id.*: 0" /proc/cpuinfo")
CE_PHYSCPU=2                # total number of CPUs in the batch system ("grep -c "core id.*: 0" /proc/cpuinfo" * #WN)
CE_LOGCPU=1                 # number of logical CPUs or job slots in the batch cluster ("grep -c processor /proc/cpuinfo" * #WN)
CE_SMPSIZE=1                # number logical CPUs per Node ("grep -c processor /proc/cpuinfo")
CE_SI00=1592                # SpecInt value (see [1] for example values)
CE_SF00=1927                # SpecFP value (see [2] for example values)
CE_OTHER_DESCR="Cores=1, Benchmark=1.11-HEP-SPEC06"   # number of cores per CPU and Version of SPEC Benchmark

SE_LIST="gks-se.fzk.de"     # List of CloseSEs
SE_MOUNT_INFO_LIST="none"   # Mount info of the SE. If not supported by SE put "none" here.

VOS="dech"                  # supported VOs
QUEUES="test"               # The name of the queues defined in the Batch System
TEST_GROUP_ENABLE="dech"    # format of this parameter is: $QUEUE_GROUP_ENABLE e.g. for queue "test" and VO "dech" put TEST_GROUP_ENABLE="dech" here

ACCESS_BY_DOMAIN=false      # allow access to CREAM DB from other computers?
CREAM_DB_USER=creamdb       # Cream DB user name 
CREAM_DB_PASSWORD="secretPassword" # CREAM DB password 
BLPARSER_HOST=$CE_HOST      # Fully qualified name of machine hosting the BLAH blparser
BLP_PORT=33333              # Port where BLAH Blparser listens to 
CREAM_PORT=56565            # Port to access CREAM CE
CEMON_HOST=$CE_HOST         # Fully qualified name of CEMon host (do not use localhost !)
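
The comments on CE_MINPHYSMEM and CE_MINVIRTMEM above point out that /proc/meminfo reports kB while the variables expect MB. A minimal sketch of that conversion follows; the helper name meminfo_mb is hypothetical, and the result is rounded down to whole MB.

```shell
# Hypothetical helper: print a /proc/meminfo field converted from kB to MB.
# Usage: meminfo_mb <field> [<meminfo-file>]
meminfo_mb() {
    field="$1"
    file="${2:-/proc/meminfo}"
    # Lines look like "MemTotal:  2058764 kB"; divide by 1024, round down.
    awk -v key="$field:" '$1 == key { print int($2 / 1024) }' "$file"
}

# Run on the worker node:
#   meminfo_mb MemTotal     # candidate value for CE_MINPHYSMEM
#   meminfo_mb SwapTotal    # candidate value for CE_MINVIRTMEM
```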

In addition, some information for the supported virtual organizations has to be set. See VO configuration.
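
Before running yaim it can help to confirm that the variables you rely on are actually set in site-info.def. The following is a sketch under assumptions: check_site_vars is a made-up name, the file is assumed to be plain shell variable assignments, and the list of names to check is up to you.

```shell
# Hypothetical helper: report variables that are unset or empty after
# sourcing a site-info.def style file. Returns non-zero if any is missing.
# Usage: check_site_vars <site-info.def> <VAR> [<VAR> ...]
check_site_vars() (
    # The function body runs in a subshell, so sourcing the file
    # does not leak variables into the calling shell.
    conf="$1"
    shift
    . "$conf" || exit 2
    missing=0
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    exit "$missing"
)

# Example:
#   check_site_vars /root/yaim/site-info.def SITE_NAME CE_HOST BATCH_SERVER \
#       USERS_CONF GROUPS_CONF WN_LIST VOS QUEUES
```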

Configuration with yaim

NOTE: The function lists for the yaim configuration are located in /opt/glite/yaim/node-info.d/. You can have a look at the lists on the machines or in the wiki under category "Function lists". For the CREAM Compute Element configuration without a batch server you need the lists for glite-creamCE and glite-TORQUE_utils:

Functions used for the configuration of the CREAM Compute Element: Function list CREAM CE
Functions used for the configuration of the Torque utils: Function list TORQUE_utils

The accounting service running on the CREAM compute element periodically checks for new data in the directory /var/spool/pbs/server_priv/accounting. In our setup this directory does not exist on the CREAM CE; it lives on the batch system server, which also fills it. For the accounting service on the CREAM CE to work properly, we need to export this directory from the batch system server to the compute element.

  • Create the accounting directory on the CE and mount it from the batch system
mkdir -p /var/spool/pbs/server_priv/accounting
mount 141.52.174.XYZ:/var/spool/pbs/server_priv/accounting /var/spool/pbs/server_priv/accounting
  • If mounting the directory works, it needs to be included in the fstab. Add the following line to /etc/fstab
141.52.174.XYZ:/var/spool/pbs/server_priv/accounting /var/spool/pbs/server_priv/accounting nfs rw,soft 0 0
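
When adding the fstab entry by script rather than by hand, it is worth making the step repeatable so that running it twice does not duplicate the line. A minimal sketch, with the hypothetical helper name ensure_fstab_line:

```shell
# Hypothetical helper: append a line to an fstab-style file unless an
# identical line is already present.
# Usage: ensure_fstab_line <fstab-file> <line>
ensure_fstab_line() {
    fstab="$1"
    line="$2"
    # -x matches whole lines only, -F treats the line as a fixed string.
    grep -qxF -- "$line" "$fstab" 2>/dev/null || printf '%s\n' "$line" >> "$fstab"
}

# Example (replace XYZ with the address of your batch system server):
#   ensure_fstab_line /etc/fstab \
#     '141.52.174.XYZ:/var/spool/pbs/server_priv/accounting /var/spool/pbs/server_priv/accounting nfs rw,soft 0 0'
```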

Similar steps are required for another service located at the CREAM compute element. The BLParser service is responsible for parsing the batch system server log files. By parsing the log files, the BLParser gains information about the status (queued, running, done, aborted,...) of the various grid jobs in the batch system. For proper operation you need to mount the log files from the batch system server to the compute element.

mount 141.52.174.XYZ:/var/spool/pbs/server_logs /var/spool/pbs/server_logs # on the CE command line

or alternatively

141.52.174.XYZ:/var/spool/pbs/server_logs /var/spool/pbs/server_logs nfs rw,soft 0 0 # in the CE /etc/fstab
mount -a # on the command line on the CE
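
Since both the accounting service and the BLParser depend on these mounts, you may want to confirm that they are present before calling yaim. The sketch below looks the directory up in /proc/mounts; the name is_mount_target is an assumption for illustration.

```shell
# Hypothetical helper: succeed if <dir> appears as a mount point in a
# /proc/mounts style table (second whitespace-separated field).
# Usage: is_mount_target <dir> [<mounts-file>]
is_mount_target() {
    dir="${1%/}"                    # drop one trailing slash, if any
    table="${2:-/proc/mounts}"
    awk -v d="$dir" '$2 == d { found = 1 } END { exit !found }' "$table"
}

# Example (on the CE):
#   is_mount_target /var/spool/pbs/server_priv/accounting || echo "accounting not mounted"
#   is_mount_target /var/spool/pbs/server_logs || echo "server_logs not mounted"
```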

Now call yaim for the node type creamCE and TORQUE_utils:

/opt/glite/yaim/bin/yaim -c -s /root/yaim/site-info.def -n glite-creamCE -n glite-TORQUE_utils

The output of yaim is listed below. Your output should look similar to it.

  INFO: Using site configuration file: /root/yaim/site-info.def
  INFO: 
        ###################################################################
        
        .             /'.-. ')
        .     yA,-"-,( ,m,:/ )   .oo.     oo    o      ooo  o.     .oo
        .    /      .-Y a  a Y-.     8. .8'    8'8.     8    8b   d'8
        .   /           ~ ~ /         8'    .8oo88.     8    8  8'  8
        . (_/         '===='          8    .8'     8.   8    8  Y   8
        .   Y,--,Yy,-.,/           o8o  o8o    o88o  o8o  o8o    o8o
        .    I_))_) I_))_)
        
        
        current working directory: /root/yaim
        site-info.def date: Sep 1 12:19 /root/yaim/site-info.def
        yaim command: -c -s /root/yaim/site-info.def -n glite-creamCE -n glite-TORQUE_utils
        log file: /opt/glite/yaim/bin/../log/yaimlog
        Wed Sep  1 12:19:58 CEST 2010 : /opt/glite/yaim/bin/yaim
        
        Installed YAIM versions:
        glite-yaim-core 4.0.12-1
        glite-yaim-cream-ce 4.1.0-14
        glite-yaim-torque-utils 4.0.4-1
        
        ####################################################################
  INFO: The default location of the grid-env.(c)sh files will be: /opt/glite/etc/profile.d
  INFO: Sourcing the utilities in /opt/glite/yaim/functions/utils
  INFO: Detecting environment
  INFO: Executing function: config_cream_stop_check 
  INFO: Executing function: config_cream_clean_check 
  INFO: Executing function: config_cream_db_check 
  INFO: Executing function: config_add_pool_env_check 
  INFO: Executing function: config_host_certs_check 
  INFO: Executing function: config_vomsdir_check 
  INFO: Executing function: config_vomses_check 
  INFO: Executing function: config_users_check 
  INFO: Executing function: config_edgusers_check 
  INFO: Executing function: config_cream_glexec_user_check 
  INFO: Executing function: config_cream_sudoers_check 
  INFO: Executing function: config_secure_tomcat_check 
  INFO: Executing function: config_vomsmap_check 
  INFO: Executing function: config_globus_clients_check 
  INFO: Executing function: config_rgma_client_check 
  INFO: Executing function: config_lcas_lcmaps_gt4_check 
  INFO: Executing function: config_globus_gridftp_check 
  INFO: Executing function: config_cream_glexec_check 
  INFO: Executing function: config_cream_blah_check 
  INFO: Executing function: config_cream_ce_check 
  INFO: Executing function: config_cream_logrotation_check 
  INFO: Executing function: config_cream_gip_check 
  INFO: Executing function: config_gip_scheduler_plugin_check 
  INFO: Executing function: config_gip_vo_tag_check 
  INFO: Executing function: config_info_service_cream_ce_check 
  INFO: Executing function: config_info_service_cemon_check 
  INFO: Executing function: config_cream_cemon_check 
  INFO: Executing function: config_cream_gliteservices_check 
  INFO: Executing function: config_cream_locallogger_check 
  INFO: Executing function: config_glite_locallogger_check 
  INFO: Executing function: config_maui_cfg_check 
  INFO: Executing function: config_apel_pbs_check 
  INFO: Executing function: config_gip_sched_plugin_pbs_check 
  INFO: Executing function: config_torque_submitter_ssh_check 
  INFO: Executing function: config_bdii_only_check 
  INFO: Executing function: config_cream_stop_setenv 
  INFO: Executing function: config_cream_stop 
  INFO: blah not running
  INFO: tomcat not running
  INFO: blah not running
  INFO: lb processes not running
  INFO: Executing function: config_cream_clean_setenv 
  INFO: Executing function: config_cream_clean 
  INFO: Executing function: config_cream_db_setenv 
  INFO: Executing function: config_cream_db 
Initializing MySQL database:  Installing MySQL system tables...
OK
Filling help tables...
OK

To start mysqld at boot time you have to copy
support-files/mysql.server to the right place for your system

PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER !
To do so, start the server, then issue the following commands:
/usr/bin/mysqladmin -u root password 'new-password'
/usr/bin/mysqladmin -u root -h gks-1-126 password 'new-password'

Alternatively you can run:
/usr/bin/mysql_secure_installation

which will also give you the option of removing the test
databases and anonymous user created by default.  This is
strongly recommended for production servers.

See the manual for more instructions.

You can start the MySQL daemon with:
cd /usr ; /usr/bin/mysqld_safe &

You can test the MySQL daemon with mysql-test-run.pl
cd mysql-test ; perl mysql-test-run.pl

Please report any problems with the /usr/bin/mysqlbug script!

The latest information about MySQL is available on the web at
http://www.mysql.com
Support MySQL by buying support/licenses at http://shop.mysql.com
                                                          [  OK  ]
Starting MySQL:                                            [  OK  ]
The database version requested by cream service is 2.4
Impossible to retrieve the version of creamdb database. Database will be created from scratch.
Creating/Updating creamdb database...
creamdb database created!
The database version requested by cream service is 2.4
Impossible to retrieve the version of delegationdb database. Database will be created from scratch.
Creating/Updating delegationdb database...
delegationdb database created!
  INFO: Executing function: config_add_pool_env_setenv 
  INFO: Executing function: config_add_pool_env 
  INFO: Executing function: config_ldconf 
  INFO: Executing function: config_sysconfig_edg 
  INFO: Executing function: config_host_certs 
  INFO: Executing function: config_crl 
  INFO: Now updating the CRLs - this may take a few minutes...
  WARNING: /opt/glite/libexec/fetch-crl.sh didn't finish succesfully
  WARNING: CRLs may not be updated, please have a look !
  INFO: Executing function: config_vomsdir_setenv 
  INFO: Executing function: config_vomsdir 
  INFO: Executing function: config_vomses 
  INFO: Executing function: config_users 
  INFO: Executing function: config_edgusers 
  INFO: Executing function: config_cream_glexec_user_setenv 
  INFO: Executing function: config_cream_glexec_user 
  INFO: CONFIG_USERS is set to yes
  INFO: Executing function: config_cream_sudoers_setenv 
  INFO: Executing function: config_cream_sudoers 
  INFO: Executing function: config_secure_tomcat_setenv 
/usr/bin/rebuild-jar-repository: error: Could not find xml-commons-apis Java extension for this JVM
/usr/bin/rebuild-jar-repository: error: Some detected jars were not found for this jvm
  INFO: Executing function: config_secure_tomcat 
  INFO: Check that java is installed
  INFO: Check that tomcat is installed
  INFO: Stop tomcat in case it's running
Stopping tomcat5:    INFO: Copying hostcert to /etc/grid-security/tomcat-cert.pem for tomcat:root......
  INFO: Copying hostkey to /etc/grid-security/tomcat-key.pem for tomcat:root...
  INFO: Configuring /etc/tomcat5/server.xml...
  INFO: Copying trustmanager deps to tomcat server lib directory..
  INFO: Defining JAVA_HOME in the Tomcat configuration file
  INFO: Starting Tomcat
Starting tomcat5:                                          [  OK  ]
  INFO: Executing function: config_vomsmap_setenv 
  INFO: Executing function: config_vomsmap 
  INFO: Creating grid-map directory in /etc/grid-security/gridmapdir
  INFO: Creating voms grid-map file in /etc/grid-security/voms-grid-mapfile
  INFO: Creating voms groupmap file in /etc/grid-security/groupmapfile
  INFO: Copying the /etc/grid-security/voms-grid-mapfile in the standard location /etc/grid-security/grid-mapfile
  INFO: Executing function: config_globus_clients_setenv 
  INFO: Executing function: config_globus_clients 
  INFO: Configure the globus service
setup-tmpdirs: creating ./config.status
config.status: creating globus-script-initializer
config.status: creating Paths.pm
creating globus-sh-tools-vars.sh
creating globus-script-initializer
creating Globus::Core::Paths
checking globus-hostname
Done
  INFO: Executing function: config_rgma_client_setenv 
  INFO: Executing function: config_rgma_client 
  INFO: YAIM has detected the OS is SL5. The rgma client is no longer configured in SL5.
  INFO: Executing function: config_lcas_lcmaps_gt4_setenv 
  INFO: Executing function: config_lcas_lcmaps_gt4 
  INFO: Creating LCAS_DB_FILE in /opt/glite/etc/lcas/lcas.db
  INFO: Creating LCMAPS_DB_FILE in /opt/glite/etc/lcmaps/lcmaps.db
  INFO: Executing function: config_globus_gridftp_setenv 
  INFO: Executing function: config_globus_gridftp 
  INFO: Starting gridftp service :
Shutting down globus-gridftp-server: [FAILED]
Starting globus-gridftp-server[  OK  ]
  INFO: Executing function: config_cream_glexec_setenv 
  INFO: Executing function: config_cream_glexec 
  INFO: Executing function: config_cream_blah_setenv 
  INFO: Executing function: config_cream_blah 
  INFO: Executing function: config_cream_ce_setenv 
  INFO: Executing function: config_cream_ce 
Starting tomcat5: tomcat5 process already running
  INFO: Executing function: config_cream_logrotation_setenv 
  INFO: Executing function: config_cream_logrotation 
  INFO: Executing function: config_gip_only 
  INFO: Executing function: config_cream_gip_setenv 
  INFO: Executing function: config_cream_gip 
  INFO: Executing function: config_gip_scheduler_plugin_setenv 
  INFO: Executing function: config_gip_scheduler_plugin 
  INFO: Executing function: config_cream_gip_software_plugin 
  INFO: Executing function: config_gip_vo_tag 
  INFO: Executing function: config_gip_service_release 
  INFO: Executing function: config_info_service_cream_ce_setenv 
  INFO: Executing function: config_info_service_cream_ce 
  INFO: Executing function: config_info_service_cemon_setenv 
glite-lb-interlogd: no process killed
[26574] Initializing...
[26574] Parse messages for correctness... [yes]
[26574] Send messages also to inter-logger... [yes]
[26574] Messages will be stored with the filename prefix "/var/glite/log/dglogd.log".
[26574] Server running with certificate: /C=DE/O=GermanGrid/OU=dech-school/CN=gks-1-126.fzk.de
[26574] Listening on port 9002
[26574] Running as daemon... [yes]
  INFO: Executing function: config_info_service_cemon 
  INFO: Executing function: config_cream_cemon_setenv 
  INFO: Executing function: config_cream_cemon 
  INFO: Executing function: config_cream_gliteservices_setenv 
  INFO: Executing function: config_cream_gliteservices 
  INFO: Executing function: config_cream_locallogger_setenv 
  INFO: Executing function: config_cream_locallogger 
  INFO: Executing function: config_glite_locallogger_setenv 
  INFO: Executing function: config_glite_locallogger 
  INFO: Applying the workaround for bug 22389...
Stopping glite-lb-logd ... not running
Stopping glite-lb-interlogd ... not running
Starting glite-lb-logd ...This is LocalLogger, part of Workload Management System in EU DataGrid & EGEE.
done
Starting glite-lb-interlogd ... done
  INFO: Executing function: config_glite_initd 
  INFO: Executing function: config_maui_cfg_setenv 
  INFO: Executing function: config_maui_cfg 
  INFO: configuring maui ...
  INFO: Executing function: config_apel_pbs_setenv 
  INFO: Executing function: config_apel_pbs 
  INFO: Executing function: config_gip_sched_plugin_pbs_setenv 
  INFO: Executing function: config_gip_sched_plugin_pbs 
  INFO: Executing function: config_torque_submitter_ssh 
Reloading sshd: [  OK  ]
  INFO: Executing function: config_bdii_only 
Stopping BDII: BDII Already stopped
Starting SLAPD: [  OK  ]

Starting update process: [  OK  ]

  INFO: Configuration Complete.                                               [  OK  ]
  INFO: YAIM terminated succesfully 

As you may have noticed yaim gave a warning:

WARNING: /opt/glite/libexec/fetch-crl.sh didn't finish succesfully
WARNING: CRLs may not be updated, please have a look !

If you also receive this warning, try to find out what causes it, e.g. by looking at the contents of the /opt/glite/libexec/fetch-crl.sh script and executing its commands manually.
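
One way to see whether the CRLs were actually refreshed is to look for CRL files that have not been modified recently. The sketch below assumes the usual grid layout, which is not stated on this page: CRLs stored as *.r0 files in /etc/grid-security/certificates. The helper name stale_crls is hypothetical.

```shell
# Hypothetical helper: list CRL files (*.r0) that were last modified
# more than <days> full days ago.
# Usage: stale_crls <days> [<cert-dir>]
stale_crls() {
    days="$1"
    dir="${2:-/etc/grid-security/certificates}"   # assumed default location
    find "$dir" -maxdepth 1 -type f -name '*.r0' -mtime +"$days"
}

# Example: list CRLs that were not updated during the last two days
#   stale_crls 2
```

An empty result means that every CRL file found was updated within the given period; it does not prove that all expected CRLs exist.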

Sometimes another error also shows up:

  ERROR: Error during the execution of function: config_bdii_only
  ERROR: Error during the configuration.Exiting.                              [FAILED]
  ERROR: One of the functions returned with error without specifying it's nature !

In this case restart the bdii service manually by

sh -x /etc/init.d/bdii start

and check that it has started properly:

/etc/init.d/bdii status
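
Warnings and errors like the ones above are easy to miss in the long yaim output. As a sketch, they can be filtered out of the log file named in the yaim banner (/opt/glite/yaim/bin/../log/yaimlog). The helper name yaim_problems is made up, and it is an assumption that the log file uses the same WARNING: and ERROR: prefixes as the console output.

```shell
# Hypothetical helper: print WARNING and ERROR lines from a yaim log.
# Usage: yaim_problems [<logfile>]
yaim_problems() {
    log="${1:-/opt/glite/yaim/log/yaimlog}"
    grep -E '(WARNING|ERROR):' "$log"
}

# Example:
#   yaim_problems
#   yaim_problems /path/to/saved-console-output.txt
```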

The last step is to configure the BLParser by running yaim again, this time executing only the function config_cream_blparser:

/opt/glite/yaim/bin/yaim -r -s /root/yaim/site-info.def -n glite-creamCE -f config_cream_blparser
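
Once the BLParser is configured, a quick plausibility check is to see whether something is listening on the port given as BLP_PORT in site-info.def (33333 in the example above). The sketch below reads the output of netstat -tln from standard input; the helper name listens_on is hypothetical, and the column layout of netstat is an assumption about your system.

```shell
# Hypothetical helper: succeed if the "netstat -tln" output on stdin
# contains a listening TCP socket on the given port.
# Usage: netstat -tln | listens_on <port>
listens_on() {
    # Column 4 is the local address (e.g. 0.0.0.0:33333), column 6 the state.
    awk -v p="$1" '$6 == "LISTEN" && $4 ~ (":" p "$") { found = 1 }
                   END { exit !found }'
}

# Example (on the CE):
#   netstat -tln | listens_on 33333 && echo "BLParser port is open"
```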

Go to CREAM Compute Element Testing
