Configuration of the CREAM CE
site-info.def
Get your site-info.def from your site BDII, or start from the example file /opt/glite/yaim/examples/siteinfo/site-info.def. Save it in a location of your choice, e.g. /root/yaim/ or /opt/glite/yaim/, and make sure that it is readable only by root.
mkdir /root/yaim
cd /root/yaim
scp gks-X-XYZ:/root/yaim/site-info.def /root/yaim
chmod 700 /root/yaim
chmod 600 /root/yaim/site-info.def
users.conf
For every supported virtual organization you need to define users in the users.conf file. For details please go to users.conf.
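As an illustration, pool-account entries can be generated with a small loop. This is only a sketch: it assumes the usual YAIM line layout UID:LOGIN:GID:GROUP:VO:FLAG: and the helper name make_pool_accounts, the UID range and the GID are made up for the example. Check the users.conf page for the values used at your site.

```shell
# Hypothetical generator for pool-account lines in users.conf.
# Arguments: VO name, group name, GID, first UID, number of accounts.
make_pool_accounts() {
    vo="$1"
    group="$2"
    gid="$3"
    first_uid="$4"
    count="$5"
    i=1
    while [ "$i" -le "$count" ]; do
        # Assumed layout: UID:LOGIN:GID:GROUP:VO:FLAG: (empty FLAG = ordinary pool account)
        printf '%d:%s%03d:%d:%s:%s::\n' \
            $(( $first_uid + $i - 1 )) "$vo" "$i" "$gid" "$group" "$vo"
        i=$(( $i + 1 ))
    done
}

# Three example accounts for the VO "dech"; the UID and GID values are placeholders.
make_pool_accounts dech dech 5000 50001 3
```

The output can be reviewed and then appended to the file referenced by USERS_CONF.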
groups.conf
For every group of a virtual organization you are supporting you need to define it in the groups.conf file. For details please refer to groups.conf.
wn-list.conf
For the TORQUE and the CREAM Compute Element you need a file where all your worker nodes are listed. By default the file should be stored at /opt/glite/yaim/etc/wn-list.conf. In this course you have only one host available for a worker node, so create a file /opt/glite/yaim/etc/wn-list.conf and add its full hostname.
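A minimal sketch of creating the file from the shell. The helper name write_wn_list and the scratch directory are assumptions made for the example; on the CE the target would be /opt/glite/yaim/etc/wn-list.conf, and gks-X-XYZ.fzk.de stands for the full hostname of your worker node.

```shell
# Hypothetical helper: write one fully qualified hostname per line.
write_wn_list() {
    list_file="$1"
    shift
    mkdir -p "$(dirname "$list_file")"
    printf '%s\n' "$@" > "$list_file"
}

# Demo in a scratch directory; on the CE use /opt/glite/yaim/etc/wn-list.conf.
scratch_dir="$(mktemp -d "${TMPDIR:-/tmp}/wnlist.XXXXXX")"
write_wn_list "$scratch_dir/wn-list.conf" "gks-X-XYZ.fzk.de"
cat "$scratch_dir/wn-list.conf"
```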
hostkeys
As for most grid services you need a host certificate for the CREAM Compute Element with the right permissions. For more details please refer to Host certificates.
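The permissions can be checked with a few lines of shell. The sketch below assumes the conventional layout, i.e. /etc/grid-security/hostcert.pem with mode 644 and /etc/grid-security/hostkey.pem with mode 400; the helper check_mode is hypothetical, and the demo runs on scratch files so that it can be tried without a real certificate.

```shell
# Hypothetical helper: succeed only if FILE ($1) has exactly the octal mode $2.
# Uses GNU stat, as available on Scientific Linux.
check_mode() {
    actual="$(stat -c '%a' "$1")" || return 1
    if [ "$actual" = "$2" ]; then
        echo "OK   $1 ($actual)"
    else
        echo "BAD  $1 (is $actual, expected $2)"
        return 1
    fi
}

# Demo on scratch files; on the CE check the files in /etc/grid-security instead.
cert_dir="$(mktemp -d "${TMPDIR:-/tmp}/hostcert.XXXXXX")"
touch "$cert_dir/hostcert.pem" "$cert_dir/hostkey.pem"
chmod 644 "$cert_dir/hostcert.pem"
chmod 400 "$cert_dir/hostkey.pem"
check_mode "$cert_dir/hostcert.pem" 644
check_mode "$cert_dir/hostkey.pem" 400
```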
Variables
Before the configuration script can be run, site-info.def has to be adapted. A good overview of the available parameters can be found in the LCG Wiki at CERN. The required variables are:
SITE_NAME=GKSXYZ
CE_HOST="gks-X-XYZ.fzk.de"
BATCH_SERVER="gks-X-XYZ.fzk.de"
USERS_CONF=/opt/glite/yaim/etc/users.conf
GROUPS_CONF=/opt/glite/yaim/etc/groups.conf
WN_LIST=/opt/glite/yaim/etc/wn-list.conf
JAVA_LOCATION=/usr/java/default
JOB_MANAGER=pbs
CE_BATCH_SYS=pbs
BATCH_VERSION=torque-2.3.6-2
BATCH_BIN_DIR=/usr/bin
BATCH_LOG_DIR=/var/spool/pbs/
BATCH_SPOOL_DIR=/var/spool/pbs/
CREAM_CE_STATE="Special"
BLPARSER_HOST=$CE_HOST
BLAH_JOBID_PREFIX="cream_"
BLPARSER_WITH_UPDATER_NOTIFIER="false"
MYSQL_PASSWORD=koala
APEL_DB_PASSWORD="APELDB_PWD"
# MON_HOST=mon.example.org # use this only if you still run a (deprecated) mon host
APEL_MYSQL_HOST="gks-X-XYZ" # point to APEL host
CE_OS_ARCH=x86_64 # "uname -i" executed on the WN
CE_OS=ScientificSL # "lsb_release -i | cut -f2" executed on the WN
CE_OS_RELEASE=5.4 # "lsb_release -r | cut -f2" executed on the WN
CE_OS_VERSION=Boron # "lsb_release -c | cut -f2" executed on the WN
CE_CPU_MODEL=Opteron # "grep 'model name' /proc/cpuinfo" executed on the WN - just take the main description (e.g. Opteron or Xeon)
CE_CPU_VENDOR=AMD # "grep vendor /proc/cpuinfo" executed on the WN - just take the company name (e.g. AMD or Intel)
CE_CPU_SPEED=1800 # "grep MHz /proc/cpuinfo" executed on the WN
CE_MINPHYSMEM=1024 # "grep MemTotal /proc/meminfo" executed on the WN - attention: value is shown in kB, CE_MINPHYSMEM must be in MB!
CE_MINVIRTMEM=2048 # "grep SwapTotal /proc/meminfo" executed on the WN - attention: value is shown in kB, CE_MINVIRTMEM must be in MB!
CE_OUTBOUNDIP=TRUE # do the WNs have permission for direct outbound connectivity?
CE_INBOUNDIP=FALSE # do the WNs have permission for inbound connectivity?
# list of supported middleware/software
CE_RUNTIMEENV="
    GLITE-3_2_0
    R-GMA
"
CE_CAPABILITY="none" # if VO fairshares are defined in the batch system, they need to be reflected here, otherwise this variable must be none
CE_OTHERDESCR="Cores=4" # number of cores per CPU for a typical WN ("grep -c 'physical id.*: 0' /proc/cpuinfo")
CE_PHYSCPU=2 # total number of CPUs in the batch system ("grep -c 'core id.*: 0' /proc/cpuinfo" * #WN)
CE_LOGCPU=1 # number of logical CPUs or job slots in the batch cluster ("grep -c processor /proc/cpuinfo" * #WN)
CE_SMPSIZE=1 # number of logical CPUs per node ("grep -c processor /proc/cpuinfo")
CE_SI00=1592 # SpecInt value (see [1] for example values)
CE_SF00=1927 # SpecFP value (see [2] for example values)
CE_OTHER_DESCR="Cores=1, Benchmark=1.11-HEP-SPEC06" # number of cores per CPU and version of the SPEC benchmark
SE_LIST="gks-se.fzk.de" # list of CloseSEs
SE_MOUNT_INFO_LIST="none" # mount info of the SE. If not supported by the SE put "none" here.
VOS="dech" # supported VOs
QUEUES="test" # names of the queues defined in the batch system
TEST_GROUP_ENABLE="dech" # format of this parameter is: $QUEUE_GROUP_ENABLE, e.g. for queue "test" and VO "dech" put TEST_GROUP_ENABLE="dech" here
ACCESS_BY_DOMAIN=false # allow access to the CREAM DB from other computers?
CREAM_DB_USER=creamdb # CREAM DB user name
CREAM_DB_PASSWORD="secretPassword" # CREAM DB password
BLPARSER_HOST=$CE_HOST # fully qualified name of the machine hosting the BLAH blparser
BLP_PORT=33333 # port the BLAH blparser listens on
CREAM_PORT=56565 # port to access the CREAM CE
CEMON_HOST=$CE_HOST # fully qualified name of the CEMon host (do not use localhost!)
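The kB to MB conversion needed for CE_MINPHYSMEM and CE_MINVIRTMEM can be done directly in the shell. The following is a sketch only; the helper names kb_to_mb and meminfo_kb are made up for this example, and the division rounds down to whole MB. Run it on the worker node, since the values describe the WN and not the CE.

```shell
# Hypothetical helper: convert a kB value (as shown in /proc/meminfo) to whole MB.
kb_to_mb() {
    echo $(( ${1:-0} / 1024 ))
}

# Hypothetical helper: read a field such as MemTotal or SwapTotal
# from a meminfo-style file ($2) and print its value in kB.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "$2"
}

kb_to_mb 1048576    # 1048576 kB / 1024 = 1024 MB

# On a Linux worker node, print suggested values for site-info.def.
if [ -r /proc/meminfo ]; then
    echo "CE_MINPHYSMEM=$(kb_to_mb "$(meminfo_kb MemTotal /proc/meminfo)")"
    echo "CE_MINVIRTMEM=$(kb_to_mb "$(meminfo_kb SwapTotal /proc/meminfo)")"
fi
```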
In addition, some information for the supported virtual organizations has to be set. See VO configuration.
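Since site-info.def is a plain shell file, a quick pre-flight check before running yaim can catch variables that were forgotten. The sketch below is not part of yaim; check_site_info is a hypothetical helper that sources the file in a subshell and reports every listed variable that is unset or empty. The demo uses a small scratch file instead of a real site-info.def.

```shell
# Hypothetical pre-flight check: source FILE ($1) in a subshell and report
# which of the remaining arguments are unset or empty afterwards.
check_site_info() {
    site_info="$1"
    shift
    (
        . "$site_info" || exit 1
        missing=0
        for name in "$@"; do
            eval "value=\${$name:-}"
            if [ -z "$value" ]; then
                echo "missing: $name"
                missing=1
            fi
        done
        exit "$missing"
    )
}

# Demo with a scratch file; on the CE pass /root/yaim/site-info.def instead.
demo_file="$(mktemp "${TMPDIR:-/tmp}/site-info.XXXXXX")"
cat > "$demo_file" <<'EOF'
SITE_NAME=GKSXYZ
CE_HOST="gks-X-XYZ.fzk.de"
BATCH_SERVER="gks-X-XYZ.fzk.de"
BLPARSER_HOST=$CE_HOST
EOF
check_site_info "$demo_file" SITE_NAME CE_HOST BATCH_SERVER BLPARSER_HOST WN_LIST \
    || echo "site-info.def is incomplete"
```

Because the file is sourced in a subshell, none of its variables leak into the calling shell.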
Configuration with yaim
NOTE: The function lists for the yaim configuration are located in /opt/glite/yaim/node-info.d/. You can have a look at the lists on the machines or in the wiki under category "Function lists". For the CREAM Compute Element configuration without a batch server you need the lists for glite-creamCE and glite-TORQUE_utils:
Functions used for the configuration of the CREAM Compute Element: Function list CREAM CE
Functions used for the configuration of the Torque utils: Function list TORQUE_utils
The accounting service running on the CREAM compute element periodically checks for new data in the directory /var/spool/pbs/server_priv/accounting. In our setup, this directory does not exist on the CREAM CE but on the batch system server, which also fills it. To allow the accounting service on the CREAM CE to operate properly, we need to export this directory from the batch system server to the compute element.
- Create the accounting directory on the CE and mount it from the batch system
mkdir -p /var/spool/pbs/server_priv/accounting
mount 141.52.174.XYZ:/var/spool/pbs/server_priv/accounting /var/spool/pbs/server_priv/accounting
- If mounting the directory works, it needs to be included in the fstab so that it is mounted again after a reboot. Add the following line to /etc/fstab:
141.52.174.XYZ:/var/spool/pbs/server_priv/accounting /var/spool/pbs/server_priv/accounting nfs rw,soft 0 0
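Whether the entry is really in place can be verified with a short check. This is a sketch under assumptions: has_fstab_entry is a hypothetical helper, and the demo works on a scratch copy rather than the real /etc/fstab.

```shell
# Hypothetical helper: succeed if the fstab-style file $2 contains a
# non-comment entry whose mount point (second field) equals $1.
has_fstab_entry() {
    awk -v mp="$1" '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$2"
}

acct_dir=/var/spool/pbs/server_priv/accounting

# Demo on a scratch file; on the CE pass /etc/fstab as second argument.
sample_fstab="$(mktemp "${TMPDIR:-/tmp}/fstab.XXXXXX")"
echo "141.52.174.XYZ:$acct_dir $acct_dir nfs rw,soft 0 0" > "$sample_fstab"

if has_fstab_entry "$acct_dir" "$sample_fstab"; then
    echo "fstab entry present for $acct_dir"
else
    echo "no fstab entry for $acct_dir"
fi
```

On the CE itself, the mountpoint utility (where available) can additionally tell whether the directory is currently mounted, e.g. mountpoint -q /var/spool/pbs/server_priv/accounting.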
Similar steps are required for another service located at the CREAM compute element. The BLParser service is responsible for parsing the batch system server log files. By parsing the log files, the BLParser gains information about the status (queued, running, done, aborted,...) of the various grid jobs in the batch system. For proper operation you need to mount the log files from the batch system server to the compute element.
mkdir -p /var/spool/pbs/server_logs # on the CE command line, if the directory does not exist yet
mount 141.52.174.XYZ:/var/spool/pbs/server_logs /var/spool/pbs/server_logs # on the CE command line
Alternatively, add the following line to /etc/fstab on the CE and then run mount -a on the CE command line:
141.52.174.XYZ:/var/spool/pbs/server_logs /var/spool/pbs/server_logs nfs rw,soft 0 0
mount -a
Now call yaim for the node type creamCE and TORQUE_utils:
/opt/glite/yaim/bin/yaim -c -s /root/yaim/site-info.def -n glite-creamCE -n glite-TORQUE_utils
The output of yaim is listed below. Your output should look similar to it.
INFO: Using site configuration file: /root/yaim/site-info.def
INFO: ###################################################################
. /'.-. ')
. yA,-"-,( ,m,:/ ) .oo. oo o ooo o. .oo
. / .-Y a a Y-. 8. .8' 8'8. 8 8b d'8
. / ~ ~ / 8' .8oo88. 8 8 8' 8
. (_/ '====' 8 .8' 8. 8 8 Y 8
. Y,--,Yy,-.,/ o8o o8o o88o o8o o8o o8o
. I_))_) I_))_)
current working directory: /root/yaim
site-info.def date: Sep 1 12:19 /root/yaim/site-info.def
yaim command: -c -s /root/yaim/site-info.def -n glite-creamCE -n glite-TORQUE_utils
log file: /opt/glite/yaim/bin/../log/yaimlog
Wed Sep 1 12:19:58 CEST 2010 : /opt/glite/yaim/bin/yaim
Installed YAIM versions:
glite-yaim-core 4.0.12-1
glite-yaim-cream-ce 4.1.0-14
glite-yaim-torque-utils 4.0.4-1
####################################################################
INFO: The default location of the grid-env.(c)sh files will be: /opt/glite/etc/profile.d
INFO: Sourcing the utilities in /opt/glite/yaim/functions/utils
INFO: Detecting environment
INFO: Executing function: config_cream_stop_check
INFO: Executing function: config_cream_clean_check
INFO: Executing function: config_cream_db_check
INFO: Executing function: config_add_pool_env_check
INFO: Executing function: config_host_certs_check
INFO: Executing function: config_vomsdir_check
INFO: Executing function: config_vomses_check
INFO: Executing function: config_users_check
INFO: Executing function: config_edgusers_check
INFO: Executing function: config_cream_glexec_user_check
INFO: Executing function: config_cream_sudoers_check
INFO: Executing function: config_secure_tomcat_check
INFO: Executing function: config_vomsmap_check
INFO: Executing function: config_globus_clients_check
INFO: Executing function: config_rgma_client_check
INFO: Executing function: config_lcas_lcmaps_gt4_check
INFO: Executing function: config_globus_gridftp_check
INFO: Executing function: config_cream_glexec_check
INFO: Executing function: config_cream_blah_check
INFO: Executing function: config_cream_ce_check
INFO: Executing function: config_cream_logrotation_check
INFO: Executing function: config_cream_gip_check
INFO: Executing function: config_gip_scheduler_plugin_check
INFO: Executing function: config_gip_vo_tag_check
INFO: Executing function: config_info_service_cream_ce_check
INFO: Executing function: config_info_service_cemon_check
INFO: Executing function: config_cream_cemon_check
INFO: Executing function: config_cream_gliteservices_check
INFO: Executing function: config_cream_locallogger_check
INFO: Executing function: config_glite_locallogger_check
INFO: Executing function: config_maui_cfg_check
INFO: Executing function: config_apel_pbs_check
INFO: Executing function: config_gip_sched_plugin_pbs_check
INFO: Executing function: config_torque_submitter_ssh_check
INFO: Executing function: config_bdii_only_check
INFO: Executing function: config_cream_stop_setenv
INFO: Executing function: config_cream_stop
INFO: blah not running
INFO: tomcat not running
INFO: blah not running
INFO: lb processes not running
INFO: Executing function: config_cream_clean_setenv
INFO: Executing function: config_cream_clean
INFO: Executing function: config_cream_db_setenv
INFO: Executing function: config_cream_db
Initializing MySQL database: Installing MySQL system tables...
OK
Filling help tables...
OK
To start mysqld at boot time you have to copy
support-files/mysql.server to the right place for your system
PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER !
To do so, start the server, then issue the following commands:
/usr/bin/mysqladmin -u root password 'new-password'
/usr/bin/mysqladmin -u root -h gks-1-126 password 'new-password'
Alternatively you can run:
/usr/bin/mysql_secure_installation
which will also give you the option of removing the test
databases and anonymous user created by default. This is
strongly recommended for production servers.
See the manual for more instructions.
You can start the MySQL daemon with:
cd /usr ; /usr/bin/mysqld_safe &
You can test the MySQL daemon with mysql-test-run.pl
cd mysql-test ; perl mysql-test-run.pl
Please report any problems with the /usr/bin/mysqlbug script!
The latest information about MySQL is available on the web at
http://www.mysql.com
Support MySQL by buying support/licenses at http://shop.mysql.com
[ OK ]
Starting MySQL: [ OK ]
The database version requested by cream service is 2.4
Impossible to retrieve the version of creamdb database. Database will be created from scratch.
Creating/Updating creamdb database...
creamdb database created!
The database version requested by cream service is 2.4
Impossible to retrieve the version of delegationdb database. Database will be created from scratch.
Creating/Updating delegationdb database...
delegationdb database created!
INFO: Executing function: config_add_pool_env_setenv
INFO: Executing function: config_add_pool_env
INFO: Executing function: config_ldconf
INFO: Executing function: config_sysconfig_edg
INFO: Executing function: config_host_certs
INFO: Executing function: config_crl
INFO: Now updating the CRLs - this may take a few minutes...
WARNING: /opt/glite/libexec/fetch-crl.sh didn't finish succesfully
WARNING: CRLs may not be updated, please have a look !
INFO: Executing function: config_vomsdir_setenv
INFO: Executing function: config_vomsdir
INFO: Executing function: config_vomses
INFO: Executing function: config_users
INFO: Executing function: config_edgusers
INFO: Executing function: config_cream_glexec_user_setenv
INFO: Executing function: config_cream_glexec_user
INFO: CONFIG_USERS is set to yes
INFO: Executing function: config_cream_sudoers_setenv
INFO: Executing function: config_cream_sudoers
INFO: Executing function: config_secure_tomcat_setenv
/usr/bin/rebuild-jar-repository: error: Could not find xml-commons-apis Java extension for this JVM
/usr/bin/rebuild-jar-repository: error: Some detected jars were not found for this jvm
INFO: Executing function: config_secure_tomcat
INFO: Check that java is installed
INFO: Check that tomcat is installed
INFO: Stop tomcat in case it's running
Stopping tomcat5:
INFO: Copying hostcert to /etc/grid-security/tomcat-cert.pem for tomcat:root......
INFO: Copying hostkey to /etc/grid-security/tomcat-key.pem for tomcat:root...
INFO: Configuring /etc/tomcat5/server.xml...
INFO: Copying trustmanager deps to tomcat server lib directory..
INFO: Defining JAVA_HOME in the Tomcat configuration file
INFO: Starting Tomcat
Starting tomcat5: [ OK ]
INFO: Executing function: config_vomsmap_setenv
INFO: Executing function: config_vomsmap
INFO: Creating grid-map directory in /etc/grid-security/gridmapdir
INFO: Creating voms grid-map file in /etc/grid-security/voms-grid-mapfile
INFO: Creating voms groupmap file in /etc/grid-security/groupmapfile
INFO: Copying the /etc/grid-security/voms-grid-mapfile in the standard location /etc/grid-security/grid-mapfile
INFO: Executing function: config_globus_clients_setenv
INFO: Executing function: config_globus_clients
INFO: Configure the globus service
setup-tmpdirs: creating ./config.status
config.status: creating globus-script-initializer
config.status: creating Paths.pm
creating globus-sh-tools-vars.sh
creating globus-script-initializer
creating Globus::Core::Paths
checking globus-hostname
Done
INFO: Executing function: config_rgma_client_setenv
INFO: Executing function: config_rgma_client
INFO: YAIM has detected the OS is SL5. The rgma client is no longer configured in SL5.
INFO: Executing function: config_lcas_lcmaps_gt4_setenv
INFO: Executing function: config_lcas_lcmaps_gt4
INFO: Creating LCAS_DB_FILE in /opt/glite/etc/lcas/lcas.db
INFO: Creating LCMAPS_DB_FILE in /opt/glite/etc/lcmaps/lcmaps.db
INFO: Executing function: config_globus_gridftp_setenv
INFO: Executing function: config_globus_gridftp
INFO: Starting gridftp service :
Shutting down globus-gridftp-server: [FAILED]
Starting globus-gridftp-server[ OK ]
INFO: Executing function: config_cream_glexec_setenv
INFO: Executing function: config_cream_glexec
INFO: Executing function: config_cream_blah_setenv
INFO: Executing function: config_cream_blah
INFO: Executing function: config_cream_ce_setenv
INFO: Executing function: config_cream_ce
Starting tomcat5: tomcat5 process already running
INFO: Executing function: config_cream_logrotation_setenv
INFO: Executing function: config_cream_logrotation
INFO: Executing function: config_gip_only
INFO: Executing function: config_cream_gip_setenv
INFO: Executing function: config_cream_gip
INFO: Executing function: config_gip_scheduler_plugin_setenv
INFO: Executing function: config_gip_scheduler_plugin
INFO: Executing function: config_cream_gip_software_plugin
INFO: Executing function: config_gip_vo_tag
INFO: Executing function: config_gip_service_release
INFO: Executing function: config_info_service_cream_ce_setenv
INFO: Executing function: config_info_service_cream_ce
INFO: Executing function: config_info_service_cemon_setenv
glite-lb-interlogd: no process killed
[26574] Initializing...
[26574] Parse messages for correctness... [yes]
[26574] Send messages also to inter-logger... [yes]
[26574] Messages will be stored with the filename prefix "/var/glite/log/dglogd.log".
[26574] Server running with certificate: /C=DE/O=GermanGrid/OU=dech-school/CN=gks-1-126.fzk.de
[26574] Listening on port 9002
[26574] Running as daemon... [yes]
INFO: Executing function: config_info_service_cemon
INFO: Executing function: config_cream_cemon_setenv
INFO: Executing function: config_cream_cemon
INFO: Executing function: config_cream_gliteservices_setenv
INFO: Executing function: config_cream_gliteservices
INFO: Executing function: config_cream_locallogger_setenv
INFO: Executing function: config_cream_locallogger
INFO: Executing function: config_glite_locallogger_setenv
INFO: Executing function: config_glite_locallogger
INFO: Applying the workaround for bug 22389...
Stopping glite-lb-logd ... not running
Stopping glite-lb-interlogd ... not running
Starting glite-lb-logd ...This is LocalLogger, part of Workload Management System in EU DataGrid & EGEE. done
Starting glite-lb-interlogd ... done
INFO: Executing function: config_glite_initd
INFO: Executing function: config_maui_cfg_setenv
INFO: Executing function: config_maui_cfg
INFO: configuring maui ...
INFO: Executing function: config_apel_pbs_setenv
INFO: Executing function: config_apel_pbs
INFO: Executing function: config_gip_sched_plugin_pbs_setenv
INFO: Executing function: config_gip_sched_plugin_pbs
INFO: Executing function: config_torque_submitter_ssh
Reloading sshd: [ OK ]
INFO: Executing function: config_bdii_only
Stopping BDII: BDII Already stopped
Starting SLAPD: [ OK ]
Starting update process: [ OK ]
INFO: Configuration Complete. [ OK ]
INFO: YAIM terminated succesfully
As you may have noticed, yaim gave a warning:
WARNING: /opt/glite/libexec/fetch-crl.sh didn't finish succesfully
WARNING: CRLs may not be updated, please have a look !
If you also receive this warning, try to find out its cause. You could, for example, look at the contents of the /opt/glite/libexec/fetch-crl.sh script and execute its commands manually.
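In a long yaim run such messages are easy to miss, so it can help to filter the log afterwards. The sketch below assumes that the log file uses the same WARNING: and ERROR: prefixes as the console output; yaim_problems is a hypothetical helper, and the demo runs on a short excerpt instead of the real log file (/opt/glite/yaim/bin/../log/yaimlog in the output above).

```shell
# Hypothetical helper: print all WARNING and ERROR lines of a yaim log,
# each prefixed with its line number.
yaim_problems() {
    grep -nE '(WARNING|ERROR):' "$1"
}

# Demo on a short excerpt; on the CE pass the yaim log file instead.
sample_log="$(mktemp "${TMPDIR:-/tmp}/yaimlog.XXXXXX")"
cat > "$sample_log" <<'EOF'
INFO: Executing function: config_crl
INFO: Now updating the CRLs - this may take a few minutes...
WARNING: /opt/glite/libexec/fetch-crl.sh didn't finish succesfully
WARNING: CRLs may not be updated, please have a look !
INFO: Executing function: config_vomsdir_setenv
EOF
yaim_problems "$sample_log"
```

The line numbers make it easy to jump to the surrounding INFO lines and see which function produced the message.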
Sometimes another error also shows up:
ERROR: Error during the execution of function: config_bdii_only
ERROR: Error during the configuration.Exiting. [FAILED]
ERROR: One of the functions returned with error without specifying it's nature !
In this case, start the bdii service manually with
sh -x /etc/init.d/bdii start
and check whether it started properly:
/etc/init.d/bdii status
The last step is to configure the BLParser by running yaim again with the options shown below:
/opt/glite/yaim/bin/yaim -r -s /root/yaim/site-info.def -n glite-creamCE -f config_cream_blparser
Go to CREAM Compute Element Testing
Go back to gLite Administration Course, CREAM CE, Installation of a CREAM CE