Toaster:UMD Condor Configuration
Disk Config
Disk space on terpcondor was reclaimed by unmounting /tmp and placing its partition under LVM management.
Create physical volume and volume group.
[root@terpcondor root]# vgscan
vgscan -- reading all physical volumes (this may take a while...)
vgscan -- "/etc/lvmtab" and "/etc/lvmtab.d" successfully created
vgscan -- WARNING: This program does not do a VGDA backup of your volume group
[root@terpcondor root]# pvcreate /dev/sda5
pvcreate -- physical volume "/dev/sda5" successfully created
[root@terpcondor root]# vgcreate internal-disk /dev/sda5
vgcreate -- INFO: using default physical extent size 32 MB
vgcreate -- INFO: maximum logical volume size is 2 Terabyte
vgcreate -- doing automatic backup of volume group "internal-disk"
vgcreate -- volume group "internal-disk" successfully created and activated
Re-create tmp space.
[root@terpcondor root]# lvcreate -L2G -ntmp internal-disk
lvcreate -- doing automatic backup of "internal-disk"
lvcreate -- logical volume "/dev/internal-disk/tmp" successfully created
[root@terpcondor root]# mke2fs -j /dev/internal-disk/tmp
mke2fs 1.32 (09-Nov-2002)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
262144 inodes, 524288 blocks
26214 blocks (5.00%) reserved for the super user
First data block=0
16 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 29 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
Create the condor volume, which will store software and configs.
[root@terpcondor root]# lvcreate -L2G -ncondor internal-disk
lvcreate -- doing automatic backup of "internal-disk"
lvcreate -- logical volume "/dev/internal-disk/condor" successfully created
[root@terpcondor root]# mke2fs -j /dev/internal-disk/condor
mke2fs 1.32 (09-Nov-2002)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
262144 inodes, 524288 blocks
26214 blocks (5.00%) reserved for the super user
First data block=0
16 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 31 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
Now configure home space in /export.
[root@terpcondor root]# lvcreate -L2G -nhomes internal-disk
lvcreate -- doing automatic backup of "internal-disk"
lvcreate -- logical volume "/dev/internal-disk/homes" successfully created
[root@terpcondor root]# mke2fs -j /dev/internal-disk/homes
mke2fs 1.32 (09-Nov-2002)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
262144 inodes, 524288 blocks
26214 blocks (5.00%) reserved for the super user
First data block=0
16 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 35 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
Condor work directory (normally this just goes in /var/condor, but since this install is 100% relocatable, it's going in /export).
[root@terpcondor root]# lvcreate -ncondor-var -L2G internal-disk
lvcreate -- doing automatic backup of "internal-disk"
lvcreate -- logical volume "/dev/internal-disk/condor-var" successfully created
[root@terpcondor root]# mke2fs -j /dev/internal-disk/condor-var
mke2fs 1.32 (09-Nov-2002)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
262144 inodes, 524288 blocks
26214 blocks (5.00%) reserved for the super user
First data block=0
16 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 36 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
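The four volumes above all follow the same two-step pattern (lvcreate, then mke2fs -j). As a rough sketch, the pattern could be written as a dry-run loop that only prints the commands instead of running them. The function name plan_volume and the VG and SIZE variables are made up for this illustration; the volume names, the 2G size, and the volume group name come from the transcripts above.

```shell
#!/bin/sh
# Dry run: print the commands for each logical volume, do not execute them.
# VG, SIZE and plan_volume are illustrative names, not part of the original notes.
VG=internal-disk
SIZE=2G

plan_volume() {
    # $1 is the logical volume name
    echo "lvcreate -L$SIZE -n$1 $VG"
    echo "mke2fs -j /dev/$VG/$1"
}

for lv in tmp condor homes condor-var; do
    plan_volume "$lv"
done
```

Reviewing the printed commands before running any of them as root is the point of the dry run; mke2fs destroys whatever is on the target device.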
Add to /etc/fstab
/dev/internal-disk/tmp         /tmp                ext3  defaults  1 2
/dev/internal-disk/condor      /export/condor      ext3  defaults  1 2
/dev/internal-disk/condor-var  /export/condor-var  ext3  defaults  1 2
/dev/internal-disk/homes       /export/homes       ext3  defaults  1 2
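Since every entry has the same shape, the fstab lines can be generated from a list of volume and mount-point pairs. The following is a minimal sketch that prints the lines to standard output and writes nothing; the fstab_lines function and the name:mountpoint argument format are assumptions made for this example.

```shell
#!/bin/sh
# Print one /etc/fstab line per logical volume (nothing is written to disk).
# Each argument has the hypothetical form <lv-name>:<mount-point>.
VG=internal-disk

fstab_lines() {
    for pair in "$@"; do
        lv=${pair%%:*}     # text before the first colon
        mnt=${pair#*:}     # text after the first colon
        printf '/dev/%s/%s %s ext3 defaults 1 2\n' "$VG" "$lv" "$mnt"
    done
}

fstab_lines tmp:/tmp condor:/export/condor \
            condor-var:/export/condor-var homes:/export/homes
```

The mount points under /export have to exist before the new entries can be mounted, for example with mkdir -p followed by mount -a.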
Condor Base Install
Condor will be installed in /export/condor/condor-6.6.10, with a symlink to it created at /export/condor/condor. All scripts and such will use the symlink to make upgrades easier. To make upgrades truly a matter of copying binaries, the condor_config sits in /export/condor/condor_config. The environment variable CONDOR_CONFIG should be set so that Condor can find this file.
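One way to set CONDOR_CONFIG for login shells is a small profile snippet. This is only a sketch: the file location mentioned in the comment is an assumption, and the readability check is an extra convenience that is not part of the original setup.

```shell
# Hypothetical profile snippet, e.g. /etc/profile.d/condor.sh (file name is an assumption).
CONDOR_CONFIG=/export/condor/condor_config
export CONDOR_CONFIG

# Warn, but do not abort the login, if the file is missing or unreadable.
if [ ! -r "$CONDOR_CONFIG" ]; then
    echo "warning: $CONDOR_CONFIG is not readable" >&2
fi
```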
In ~condor/condor-6.6.10
./condor_install
Would you like to do a full installation of Condor? [yes]
...
...
Are you planning to setup Condor on multiple machines? [yes]
Will all the machines share files via a file server? [yes] no
...
...
Have you installed a release directory already? [no]
Where would you like to install the Condor release directory?
[/usr/local/condor] /export/condor/condor-6.6.10
That directory doesn't exist, should I create it now? [yes]
Installing a release directory into /export/condor/condor-6.6.10 ...
...
...
[root@terpcondor.umiacs.umd.edu] root@umiacs.umd.edu
What is the full path to a mail program that understands "-s" means
you want to specify a subject? [/bin/mail]
...
...
Do all of the machines in your pool from your domain ("umiacs.umd.edu")
share a common filesystem? [no]
Configuring each machine to be in its own filesystem domain.
Do all of the users across all the machines in your domain have a unique
UID (in other words, do they all share a common passwd file)? [no]
Configuring each machine to be in its own uid domain.
...
...
Enable Java Universe support? [yes]
I wasn't able to find a valid JVM.
Please enter the full path to the JVM, or "none" to leave unconfigured:
/opt/j2sdk1.4.2/bin/java
You entered: /opt/j2sdk1.4.2/bin/java Is that right? [no] yes
Checking to see if you have a Sun JVM...yes.
Using JVM /opt/j2sdk1.4.2/bin/java for Java universe support.
...
...
Shall I create links in some other directory? [yes]
Where should I install these files? [/usr/local/bin] /usr/bin
...
...
What is the full hostname of the central manager?
[terpcondor.umiacs.umd.edu]
...
...
You have a "condor" user on this machine. Do you want to put all the Condor
directories in /export/homes/condor? [yes] no
Do you want to put all the Condor directories in
/export/condor/condor-6.6.10/home? [yes] no
Where do you want the Condor directories? /export/condor-var
Creating all necessary Condor directories ... done.
...
...
Should I put a "condor_config.local" file in /export/condor-var? [yes]
Creating config files in "/export/condor-var" ... done.
Configuring global condor config file ... done.
Created /export/condor/condor-6.6.10/etc/condor_config.
...
...
Setting up terpcondor.umiacs.umd.edu as your central manager
What name would you like to use for this pool? This should be a
short description (20 characters or so) that describes your site.
For example, the name for the UW-Madison Computer Science Condor
Pool is: "UW-Madison CS". This value is stored in your central
manager's local config file as "COLLECTOR_NAME", if you decide to
change it later. (This shouldn't include any " marks).
UM College Park
...
...
Should I put in a soft link from /export/homes/condor/condor_config to
/export/condor/condor-6.6.10/etc/condor_config [yes]
Installing links for public binaries into /usr/bin ... done.
Post-Install cleanup
To make Condor a little more machine and install independent, we moved the condor_config to /export/condor/condor_config.
[root@terpcondor condor]# cd ~condor
[root@terpcondor condor]# ln -s /export/condor/condor_config
[root@terpcondor condor]# cd /export/condor
[root@terpcondor condor]# ln -s condor-6.6.10 condor
[root@terpcondor condor]# mv condor-6.6.10/etc/condor_config .
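With this layout, an upgrade should come down to unpacking a new release directory next to the old one and repointing the condor symlink. The sketch below rehearses that in a scratch directory so nothing under /export is touched; condor-6.6.11 is a hypothetical newer release used only for illustration.

```shell
#!/bin/sh
# Rehearse a symlink-based upgrade in a scratch directory.
# condor-6.6.11 is a made-up version number for this example.
base=$(mktemp -d)
mkdir "$base/condor-6.6.10" "$base/condor-6.6.11"

# Initial install: version-independent name points at the current release.
ln -s condor-6.6.10 "$base/condor"

# Upgrade: replace the link itself (-n keeps ln from descending into the old target).
ln -sfn condor-6.6.11 "$base/condor"

current=$(readlink "$base/condor")
echo "condor -> $current"
```

Because condor_config lives outside the versioned directory, repointing the link leaves the configuration untouched, and rolling back is the same ln -sfn command with the old directory name.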
First Start
Now we should have a working Condor install. It should be able to start up using the UMcondor init script. You'll notice that a few too many daemons start up; we'll fix this next.
(in ~condor)
[root@terpcondor condor]# ./UMcondor start
[root@terpcondor condor]# ps -ef|grep cond
squid    12859     1  0 12:48 ?        00:00:00 /opt/stow/condor/sbin/condor_master
squid    12860 12859  0 12:48 ?        00:00:00 condor_collector -f
squid    12861 12859  0 12:48 ?        00:00:00 condor_negotiator -f
squid    12862 12859 22 12:48 ?        00:00:05 condor_startd -f
squid    12864 12859  0 12:48 ?        00:00:00 condor_schedd -f
root     12890  6380  0 12:48 pts/0    00:00:00 grep cond
[root@terpcondor condor]# ./UMcondor stop
Shutting down Condor (fast-shutdown mode)
Local Configuration
In /export/condor/condor_config you'll need to make the following changes:
# version independent release dir
RELEASE_DIR = /export/condor/condor
# allow umiacs jobs to flow to this box
FLOCK_FROM = *.umiacs.umd.edu
# allow us to admin the box remotely
HOSTALLOW_ADMINISTRATOR = $(CONDOR_HOST), loach.umiacs.umd.edu
# only allow umd hosts to join this pool
HOSTALLOW_READ = *.umd.edu
HOSTALLOW_WRITE = *.umd.edu
Now, for the local config in /export/condor-var/condor_config.local:
DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, SCHEDD
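This DAEMON_LIST leaves out STARTD, the daemon that was running during the first start and that offers a machine for running jobs. A quick way to double-check the setting is to pull the list out of the config file. The following is a minimal sketch that works on a scratch copy of the line; the daemon_list function is made up for this example, and it does not handle macros or line continuations the way Condor's own parser does.

```shell
#!/bin/sh
# List the daemons named by the last DAEMON_LIST assignment in a config file.
# daemon_list is an illustrative helper, not a Condor tool.
daemon_list() {
    sed -n 's/^[[:space:]]*DAEMON_LIST[[:space:]]*=[[:space:]]*//p' "$1" \
        | tail -n 1 | tr -d ' ' | tr ',' '\n'
}

# Demo against a scratch file holding the setting shown above.
cfg=$(mktemp)
echo 'DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, SCHEDD' > "$cfg"

if daemon_list "$cfg" | grep -qx 'STARTD'; then
    echo "STARTD is listed: this host would offer itself for jobs"
else
    echo "STARTD is not listed: this host will not run jobs itself"
fi
```

On a host with Condor installed, condor_config_val DAEMON_LIST reports the value Condor actually resolves and is the more reliable check.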
Set Condor to start on boot
Copy init script and start service.
[root@terpcondor condor]# cp UMcondor /etc/init.d
[root@terpcondor condor]# /sbin/chkconfig UMcondor on
[root@terpcondor condor]# /sbin/service UMcondor start
[root@terpcondor condor]# ps -ef|grep cond
squid    13393     1  0 13:04 ?        00:00:00 /export/condor/condor/sbin/condor_master
squid    13394 13393  0 13:04 ?        00:00:00 condor_collector -f
squid    13396 13393  0 13:04 ?        00:00:00 condor_negotiator -f
squid    13397 13393  0 13:04 ?        00:00:00 condor_schedd -f
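The process listing can also be checked mechanically rather than by eye. As a sketch, the helper below reads a listing on standard input and reports any expected daemon that does not appear in it; check_daemons is a hypothetical name, and the sample listing is abbreviated from the output above.

```shell
#!/bin/sh
# Check that each expected daemon name appears in a process listing read from stdin.
# check_daemons is an illustrative helper; it returns non-zero if anything is missing.
check_daemons() {
    listing=$(cat)
    missing=0
    for d in "$@"; do
        if ! printf '%s\n' "$listing" | grep -q "$d"; then
            echo "missing: $d"
            missing=1
        fi
    done
    return $missing
}

# Sample listing, abbreviated from the ps output above.
sample='squid 13393 1 /export/condor/condor/sbin/condor_master
squid 13394 13393 condor_collector -f
squid 13396 13393 condor_negotiator -f
squid 13397 13393 condor_schedd -f'

printf '%s\n' "$sample" \
    | check_daemons condor_master condor_collector condor_negotiator condor_schedd \
    && echo "all expected daemons present"
```

On a live host the same check would be fed from ps, for example: ps -ef | check_daemons condor_master condor_collector condor_negotiator condor_schedd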
Test Condor
On a standard umiacs host, add the following to /var/condor/condor_config.local:
CONDOR_HOST = terpcondor.umiacs.umd.edu
FLOCK_FROM = rogueleader.umiacs.umd.edu
In the global umiacs config, we added a 'FLOCK_TO = terpcondor...' line to allow jobs to flow to the new Condor pool.
-- Main.MikeSmorul - 17 Aug 2005