DCC/Debian cluster FAQ

Nikola Pavkovic

Valentin Vidic

$Id: faq.sgml,v 1.16 2005/03/21 16:01:02 nix Exp $

$Date: 2005/03/21 16:01:02 $

This is a FAQ (Frequently Asked Questions) on DCC/Debian based cluster administration. The reader is expected to have some prior knowledge about common system administration tasks on a Debian based system.


1. User administration
1.1. Adding user accounts
1.2. Deleting user accounts
1.3. Modifying user accounts
2. Software administration
2.1. Installing new software on the work-nodes
2.2. Installing a new kernel on the work-nodes
2.3. Switching between MPICH and LAM
3. The queuing system
3.1. Submitting a job
3.2. Monitoring the job
3.3. Queueing system configuration
4. Miscellaneous
4.1. Changing nodes' partition configuration after installation
4.2. Changing the front-node's external hostname after installation
4.3. Work-nodes can't communicate
4.4. Security issues
4.5. How DCC installation works
4.6. Does it work on IA64?

1. User administration

1.1. Adding user accounts

To add a user to the cluster use the dcc_useradd(8) script, for example:

# dcc_useradd -m bob
This will add user bob to the user database and create a home directory for him (-m option). For the complete list of available command line options see cpu-ldap(8).
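
When several accounts have to be created at once, the same command can be wrapped in a small loop. The following is only a sketch: the function name add_users and the DCC_USERADD override (used here for a dry run) are illustrative assumptions and not part of DCC.

```shell
#!/bin/sh
# add_users: run "dcc_useradd -m" once for every user name given.
# DCC_USERADD is a hypothetical override; set it to "echo" for a dry run.
add_users() {
    for user in "$@"; do
        ${DCC_USERADD:-dcc_useradd} -m "$user" || return 1
    done
}

# Usage (as root on the front-node):
#   add_users alice bob
```

The loop stops at the first failure so that a problem with the user database is noticed before further accounts are attempted.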

1.2. Deleting user accounts

To delete a user from the cluster use the cpu(8) command, for example:

# cpu userdel -r bob
This will delete user bob from the user database and remove his home directory (-r option). Again, read cpu-ldap(8) for the list of available options.

1.3. Modifying user accounts

To change a user's password, use the passwd(1) command, either as root or as the user in question:

# passwd bob
bob@cluster$ passwd

To change a user's default shell, use the chsh(1) command, either as root or as the user in question:

# chsh bob
bob@cluster$ chsh

To change a user's GECOS field, use the chfn(1) command, either as root or as the user in question:

# chfn bob
bob@cluster$ chfn

2. Software administration

2.1. Installing new software on the work-nodes

First enter the work-node image:

# dcc_editimage node_image
Then install the software, either from a package:
CHROOT# apt-get install libpvm3
or from source:
CHROOT# ./configure && make && make install
Finally, update the work-nodes:
# cpushimage node_image

Note: If upgrading/installing a daemon

If you are installing or upgrading a package that provides a server daemon (one started from /etc/init.d/...), make sure to restart the relevant service on the nodes themselves after the image has been pushed to them. This is done as follows:

# cexec /etc/init.d/service-name stop
# cexec /etc/init.d/service-name start
Of course, replace 'service-name' with the name of the service you want to restart.
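
The push and the restart can be combined into one helper. This is a minimal sketch built only from the commands shown above; the function name push_and_restart and the RUN hook (set RUN=echo to print the commands instead of executing them) are assumptions made for illustration.

```shell
#!/bin/sh
# push_and_restart IMAGE SERVICE
# Pushes the image to the work-nodes, then stops and starts the service
# on all of them. RUN is a hypothetical hook: RUN=echo prints the
# commands instead of executing them.
push_and_restart() {
    image=$1
    service=$2
    ${RUN:-} cpushimage "$image" || return 1
    # A failing "stop" (service not running) should not prevent the start.
    ${RUN:-} cexec "/etc/init.d/$service" stop
    ${RUN:-} cexec "/etc/init.d/$service" start
}

# Usage (as root on the front-node):
#   push_and_restart node_image service-name
```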

2.2. Installing a new kernel on the work-nodes

Install the kernel from the package:

CHROOT# apt-get install kernel-image-2.4-686-smp
or use the source:
CHROOT/usr/src/linux-2.4.28# make menuconfig
CHROOT/usr/src/linux-2.4.28# make dep
CHROOT/usr/src/linux-2.4.28# make bzImage modules
CHROOT/usr/src/linux-2.4.28# make modules_install
CHROOT/usr/src/linux-2.4.28# cp arch/i386/boot/bzImage /boot/vmlinuz-2.4.28
CHROOT/usr/src/linux-2.4.28# cp System.map /boot/System.map-2.4.28
Update the list of kernel-images:
# mksidisk -A --file /etc/dcc/disktable --name image_name
Push the changes to the work-nodes:
# cpushimage node_image
Before rebooting the work-nodes, check that systemconfigurator(1p) configured the boot-loader on the work-nodes correctly (e.g. /etc/lilo.conf).

2.3. Switching between MPICH and LAM

In the default installation both MPICH and LAM are installed. MPICH is used by default. To temporarily switch to LAM use mpirun.lam, mpicc.lam etc. To permanently switch to LAM use:

# update-alternatives --config mpi
# update-alternatives --config mpirun

3. The queuing system

3.1. Submitting a job

To submit a job to the queueing system use qsub(1). If you want an interactive session, use it like this:

$ qsub -I
That will open an interactive session on a free node within the cluster. When you are finished, just exit the shell. This is the simplest way of running jobs through the queueing system. However, your session does not need to be interactive. You can create a script containing the shell commands you want to execute, and submit the script to the queueing system:
$ qsub /home/user/testjob.sh
The queueing system will run the script on one of the work-nodes. It is possible to describe the needed resources in detail, for example:
$ qsub -l nodes=2:ppn=2 /home/user/testjob.sh
The requested resources are two work-nodes (nodes=2) with two processors each (ppn=2). For a detailed list of available parameters, check the Job submission section of the Torque manual.
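
Resource requests can also be written into the job script itself as #PBS lines, so that a plain "qsub testjob.sh" is enough. The script below is a sketch of what /home/user/testjob.sh might look like; the job name and the helper functions count_slots and list_nodes are made up for this example. It relies on Torque setting PBS_NODEFILE (one line per allocated processor slot) and PBS_O_WORKDIR (the directory qsub was called from) inside a job.

```shell
#!/bin/sh
#PBS -N testjob
#PBS -l nodes=2:ppn=2

# count_slots FILE: number of processor slots listed in a node file.
count_slots() {
    wc -l < "$1" | tr -d ' '
}

# list_nodes FILE: distinct work-nodes listed in a node file.
list_nodes() {
    sort -u "$1"
}

# Only report when running inside a Torque job.
if [ -n "${PBS_NODEFILE:-}" ]; then
    cd "${PBS_O_WORKDIR:-.}" || exit 1
    echo "slots: $(count_slots "$PBS_NODEFILE")"
    echo "nodes:"
    list_nodes "$PBS_NODEFILE"
fi
```

Options given on the qsub command line take precedence over the #PBS lines, so the script defaults can still be overridden per submission.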

3.2. Monitoring the job

Once you have successfully submitted your job to the queueing system, you can monitor its status with the qstat(1), pbstop or pestat commands.

Additionally, you can monitor overall cluster performance at your Ganglia cluster monitor web interface. Your local Ganglia URL is http://your.cluster.fqdn/ganglia
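
For scripted use, qstat can be polled until the job has left the queue. The helper below is a sketch under one assumption: that qstat exits with a non-zero status once the server no longer knows the job. If the server is configured to keep completed jobs listed for a while, the loop only ends after that period. The function name wait_for_job and the QSTAT override are hypothetical.

```shell
#!/bin/sh
# wait_for_job JOBID [INTERVAL]
# Polls qstat until it no longer reports the job, then returns.
# Assumption: qstat exits non-zero once the job is unknown to the server.
# QSTAT is a hypothetical override used for testing.
wait_for_job() {
    jobid=$1
    interval=${2:-30}
    while ${QSTAT:-qstat} "$jobid" >/dev/null 2>&1; do
        sleep "$interval"
    done
    echo "job $jobid is no longer in the queue"
}

# Usage:
#   wait_for_job "$(qsub /home/user/testjob.sh)"
```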

3.3. Queueing system configuration

For information on fine-tuning your queueing system configuration, please refer to the TORQUE Admin Manual.

4. Miscellaneous

4.1. Changing nodes' partition configuration after installation

If you want to change disk partition configuration on the work-nodes, first change the settings in /etc/dcc/disktable so that it reflects your desired partitioning scheme. After that, you have to issue two commands for the changes to take effect:

mksidisk -A --name node --file /etc/dcc/disktable
mkautoinstallscript --image node --force --ip-assignment dhcp --post-install reboot
Finally, reboot your work-nodes so that the installation process takes place. For more information on the above two commands, consult the mksidisk(1) and mkautoinstallscript(8) man pages.
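
Since the two commands always go together, they can be kept in one helper. This is a sketch only: the function name repartition_nodes and the RUN hook (RUN=echo prints the commands instead of running them) are assumptions, while the options are exactly those shown above.

```shell
#!/bin/sh
# repartition_nodes IMAGE [DISKTABLE]
# Regenerates the partitioning information and the autoinstall script
# for IMAGE. RUN is a hypothetical hook: RUN=echo gives a dry run.
repartition_nodes() {
    image=$1
    table=${2:-/etc/dcc/disktable}
    ${RUN:-} mksidisk -A --name "$image" --file "$table" || return 1
    ${RUN:-} mkautoinstallscript --image "$image" --force \
        --ip-assignment dhcp --post-install reboot
}

# Usage (as root on the front-node):
#   repartition_nodes node
```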

4.2. Changing the front-node's external hostname after installation

To change the front-node's external hostname after installation, you have to update the following configuration files on the front-node: /etc/hosts, /etc/hostname, /etc/network/interfaces, /etc/torque/server_name, /etc/c3.conf and /etc/gmond.conf. You will also want to reconfigure your MTA, which can be done like this (if you're using exim4):

# dpkg-reconfigure exim4-config
Finally, reboot the cluster to check that everything works correctly.
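
Before rebooting, it can help to verify that none of the files listed above still mentions the old name. The following sketch does this with grep; the function name find_hostname_refs and the hostname old.example.org are placeholders chosen for this example.

```shell
#!/bin/sh
# find_hostname_refs OLDNAME FILE...
# Prints the name of every given file that still contains OLDNAME.
# Files that do not exist are silently skipped.
find_hostname_refs() {
    old=$1
    shift
    # Without files grep would read standard input, so return early.
    [ $# -gt 0 ] || return 0
    grep -l -F -- "$old" "$@" 2>/dev/null
    return 0
}

# Usage on the front-node ("old.example.org" is a placeholder):
#   find_hostname_refs old.example.org /etc/hosts /etc/hostname \
#       /etc/network/interfaces /etc/torque/server_name \
#       /etc/c3.conf /etc/gmond.conf
```

An empty result means that no listed file refers to the old hostname any more.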

4.3. Work-nodes can't communicate

If jobs spanning multiple work-nodes won't start, try running the following three commands on the front-node:

# cpushimage image_name
# /etc/init.d/torque-server restart-quick
# cexec /etc/init.d/torque-mom restart-quick
This is required because various files can get out of sync if the nodes are installed (and booted) one after the other. Consider the following sequence of events:

  1. dcc_discovernode creates node1

  2. node1 gets installed

  3. node1 reboots

  4. dcc_discovernode creates node2

  5. node2 gets installed

  6. node2 reboots

Node1 doesn't "know" (from /etc/hosts, /etc/ssh_known_hosts) about node2 because it was installed (step 2) before node2 was created (step 4). Therefore it is necessary to update node1 using cpushimage.

The same goes for torque-mom on the work-nodes. When torque-mom starts on node1 (step 3), it receives a list of work-nodes from torque-server on the front-node. That list contains only the one work-node present at that time: node1 itself. Work-nodes use that list to authenticate connections from other work-nodes: only connections from known work-nodes are allowed. If node2 tries to connect to node1, the connection will fail because node2 is not in node1's list of allowed clients.

This is why it is necessary to restart torque-mom after all the nodes are installed. Once the torque-moms start again they receive the correct list of allowed clients, and connections between them work. The quick restart (restart-quick) is used so that any running jobs don't get killed. If you don't restart the torque-moms, you may see jobs that are scheduled to run but never start. When this happens, torque-mom logs a "bad connect from IP - unauthorized" message.
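
To check whether a work-node is affected, its torque-mom log can be searched for that message. The helper below is a minimal sketch; the function name count_bad_connects is made up, and the log file must be passed explicitly because its location depends on how Torque was installed.

```shell
#!/bin/sh
# count_bad_connects LOGFILE
# Prints how many lines of a torque-mom log contain "bad connect from".
count_bad_connects() {
    # grep -c exits with status 1 when the count is 0, so ignore the status.
    grep -c 'bad connect from' "$1" || true
}

# Usage on a work-node (the path is a placeholder):
#   count_bad_connects /path/to/torque-mom.log
```

A non-zero count after all nodes have been installed suggests that torque-mom on that node still needs to be restarted.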

4.4. Security issues

When you start working on DCC clusters you may notice that the root password is not set on the work-nodes, that root's SSH key doesn't exist on the work-nodes, and so on. We decided on this because it is not safe to put sensitive information into the images. Images are served anonymously via rsync(1), so any cluster user can retrieve sensitive data from the image:

bob@cluster$ rsync -v rsync://node0/image_name/etc/shadow .
shadow

sent 91 bytes  received 784 bytes  1750.00 bytes/sec
total size is 679  speedup is 0.78
While it is possible to use rsync or SSH password authentication, there is no easy fix that would still allow for unattended work-node installation. Therefore, be careful about putting sensitive information in the images.
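
One way to audit an image before pushing it is to check that its shadow file contains no usable password hashes. This is a sketch, not a DCC tool: the function name list_password_hashes is hypothetical, and the path to the image's root directory on the front-node has to be supplied by the administrator.

```shell
#!/bin/sh
# list_password_hashes SHADOW_FILE
# Prints the accounts whose password field holds a usable hash, i.e. is
# neither empty nor locked with a leading "!" or "*".
list_password_hashes() {
    awk -F: '$2 != "" && $2 !~ /^[!*]/ { print $1 }' "$1"
}

# Usage on the front-node (IMAGE_ROOT is a placeholder for the
# directory that holds the image):
#   list_password_hashes "$IMAGE_ROOT/etc/shadow"
```

Any account name printed here has a hash that every cluster user could download and attack offline.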

4.5. How DCC installation works

The debconf-dcc package is installed first. It will ask you (through debconf) to enter the names of the front-end's internal and external interfaces. It will deduce other information from these answers (using ifconfig(8) and friends). As a result, /etc/dcc/config and /etc/dcc/debconf are created. /etc/dcc/config is the main DCC configuration file, and /etc/dcc/debconf contains a debconf database with preloaded answers for the packages DCC depends on. This way, when dcc-front and its dependencies (like slapd, ssh etc.) are installed, they won't ask you any questions. Hence you have a silent install. Moreover, dcc-front depends on a series of packages whose only purpose is to configure existing Debian packages. slapd-dcc will, for example, configure the LDAP server to be used as a user database for the cluster. For details, look at the postinst scripts of those packages.

DCC installation in the work-node image is similar. After debootstrap builds the base system, /etc/dcc/config is copied into the image. When debconf-dcc is installed in the image it will use this file instead of asking questions. Just like dcc-front on the front-node, dcc-node and its dependencies will be silently installed because of the preloaded answers to debconf questions (/etc/dcc/debconf). See dcc_buildimage(8) for details of the image build process.

Although all this makes the installation simple, it also violates the strict Debian policy guidelines. slapd-dcc, for example, modifies /etc/ldap/slapd.conf, which is owned by another package. Moreover, there is no sense in installing e.g. slapd-dcc if debconf-dcc isn't installed first. This applies to the other DCC packages too. There might also be other issues not mentioned here.

4.6. Does it work on IA64?

Several manual steps are required. More information is available in the HOWTO provided by Gordon Grubert.