Cockcroft Institute Condor Pool
This static information was updated 3/3/2009. The pool is currently maintained by Jonny Smith.
We would like to invite you to join the Cockcroft Condor system, which makes your unused desktop computing power available to fellow scientists and engineers in the Cockcroft Institute who are doing computationally intensive calculations.
Within the Cockcroft Institute there are a number of activities which are computationally very demanding. These include simulations that track particles around the NLS, microbunching studies, calculations of RF modes in cavities, and wakefield calculations. We determined that in order to achieve our goals we needed to expand the computing hardware available to us, especially as we foresee our computational requirements growing. We have negotiated access to a number of large clusters of machines in CSE, Lancaster, Liverpool and Manchester, but we have been encouraged to make full use of those resources already available to us, and identified the best solution with help from the e-science department of STFC.
What is Condor?
Quite often our desktop computers sit idle when we are not using them, and have no computations running on them. Condor is an application that takes these spare computing cycles, and makes them appear as a cluster for our own use. It is configured so that if a condor managed task is running and you return to your computer, this task will completely vacate your computer and find somewhere else to run.
As a test, I have successfully run ABCI (a wakefield calculation program) and various other test programs on a collection of computers using Condor, and several colleagues are looking at using it to manage a queue of Opera jobs, particle tracking simulations, etc. We are now upgrading to a 'production environment'. By this I mean encouraging everyone with a computer in the CI on the DL internal network to join, although I should add that you will need to be an Administrator on the computer for the setup to work properly. Unfortunately, computers on the visitor network do not have the right access through the firewall.
condor_status -pool ci-condor.dl.ac.uk
Name          OpSys   Arch   State     Activity LoadAv   Mem   ActvtyTime
ci-condor.dl. LINUX   INTEL  Unclaimed Idle     0.000   1009   0+01:25:04
slot1@dlccrof LINUX   INTEL  Owner     Idle     1.000    500   0+22:47:47
slot2@dlccrof LINUX   INTEL  Owner     Idle     2.440    500   0+10:06:17
slot1@dlccrof LINUX   INTEL  Owner     Idle     1.000   2024   4+22:12:35
slot2@dlccrof LINUX   INTEL  Unclaimed Idle     0.000   2024   0+06:25:09
dlccroft3.dl. LINUX   INTEL  Unclaimed Idle     0.000   1002   0+05:28:22
slot1@dlccrof LINUX   INTEL  Unclaimed Idle     0.190    504   0+08:00:08
slot2@dlccrof LINUX   INTEL  Unclaimed Idle     0.000    504 102+22:16:02
slot1@kvg9122 LINUX   INTEL  Owner     Idle     0.400   1961   0+05:05:08
slot2@kvg9122 LINUX   INTEL  Unclaimed Idle     0.000   1961   0+12:05:16
slot1@ycg3488 LINUX   INTEL  Unclaimed Idle     0.030    492   0+08:20:09
slot2@ycg3488 LINUX   INTEL  Unclaimed Idle     0.000    492   0+23:15:28
jda23eve1.dl. LINUX   X86_64 Owner     Idle     1.000   3017   6+18:22:13
slot1@DLCCROF WINNT51 INTEL  Owner     Idle     0.550   1022   0+05:00:07
slot2@DLCCROF WINNT51 INTEL  Owner     Idle     0.000   1022   0+05:10:08
apws16.dl.ac. WINNT51 INTEL  Unclaimed Idle     0.000   1023   0+05:00:03
slot1@apws24. WINNT51 INTEL  Owner     Idle     1.000   1663   1+00:30:21
slot2@apws24. WINNT51 INTEL  Owner     Idle     1.000   1663   1+00:30:22
slot1@djd63vi WINNT51 INTEL  Owner     Idle     0.570   1023   0+06:10:08
slot2@djd63vi WINNT51 INTEL  Owner     Idle     0.000   1023   0+06:10:09
slot1@dlccrof WINNT51 INTEL  Unclaimed Idle     0.000   1790   0+06:20:08
slot2@dlccrof WINNT51 INTEL  Unclaimed Idle     0.000   1790   0+06:40:10
slot1@dlccrof WINNT51 INTEL  Unclaimed Idle     0.000   1790   0+11:25:16
slot2@dlccrof WINNT51 INTEL  Unclaimed Idle     0.010   1790   0+08:50:12
slot1@dlccrof WINNT51 INTEL  Unclaimed Idle     0.000   1658   0+19:30:25
slot2@dlccrof WINNT51 INTEL  Unclaimed Idle     0.000   1658   0+07:30:09
slot2@dlccrof WINNT51 INTEL  Unclaimed Idle     0.000    510   0+05:35:09
slot1@dlccrof WINNT51 INTEL  Owner     Idle     0.000   1663   0+18:40:11
slot2@dlccrof WINNT51 INTEL  Owner     Idle     0.210   1663   0+19:10:12
slot1@gcb53vi WINNT51 INTEL  Unclaimed Idle     0.010   1535   0+05:20:08
slot2@gcb53vi WINNT51 INTEL  Unclaimed Idle     0.010   1535   0+05:40:13
slot1@jac93vi WINNT51 INTEL  Owner     Idle     0.000   1010   0+05:25:08
slot2@jac93vi WINNT51 INTEL  Owner     Idle     0.030   1010   0+05:35:09
slot2@lbj37vi WINNT51 INTEL  Owner     Idle     0.000   1662   0+05:15:08
slot1@pg45vig WINNT51 INTEL  Owner     Idle     0.310    511   0+05:05:08
slot2@pg45vig WINNT51 INTEL  Owner     Idle     0.000    511   0+05:05:09
slot3@pg45vig WINNT51 INTEL  Owner     Idle     0.000    511   0+05:05:10
slot4@pg45vig WINNT51 INTEL  Owner     Idle     0.000    511   0+05:05:11
slot1@rfsim1. WINNT51 INTEL  Unclaimed Idle     0.000    383   0+06:55:08
slot2@rfsim1. WINNT51 INTEL  Unclaimed Idle     0.000    383   0+17:00:27
slot3@rfsim1. WINNT51 INTEL  Unclaimed Idle     0.000   2302   0+22:15:34
slot1@rfsim2. WINNT51 INTEL  Unclaimed Idle     0.000   3070   0+05:45:07
slot1@rfsim4. WINNT51 INTEL  Claimed   Busy     0.010   1534   5+00:06:01
slot2@rfsim4. WINNT51 INTEL  Unclaimed Idle     0.280    511   0+05:09:52
slot1@rf64sim WINNT52 INTEL  Unclaimed Idle     0.020  65533   0+08:40:07

              Total Owner Claimed Unclaimed Matched Preempting Backfill
INTEL/LINUX      12     4       0         8       0          0        0
INTEL/WINNT51    31    15       1        15       0          0        0
INTEL/WINNT52     1     0       0         1       0          0        0
X86_64/LINUX      1     1       0         0       0          0        0
Total            45    20       1        24       0          0        0
Up to 60 cores may be available at times.
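If you want to pick particular machines out of a listing like the one above, a small filter over the condor_status output can help. The following is only a sketch: the function name slots_in_state is made up for illustration, and it assumes the column order shown above (Name, OpSys, Arch, State, ...).

```shell
# Print the Name column for slots whose OpSys starts with the given prefix
# and whose State matches exactly. Reads condor_status-style text on stdin.
# (slots_in_state is an illustrative name, not a Condor command.)
slots_in_state() {
    opsys=$1
    state=$2
    awk -v o="$opsys" -v s="$state" \
        'index($2, o) == 1 && $4 == s { print $1 }'
}
```

For example, "condor_status -pool ci-condor.dl.ac.uk | slots_in_state LINUX Unclaimed" would list the Linux slots that are currently unclaimed. condor_status also accepts a -constraint expression, which may be a more direct route once you are familiar with it.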
ASTeC Orion-Galaxy Cluster
A 96-core x86 cluster, mainly 1GHz processors running Linux. For access contact Jonny Smith e-mail firstname.lastname@example.org
Linux Condor Instructions (Scientific Linux preliminary)
Get the RPM - there are RHEL5 ones at
depending on your flavour.
As root do "yum localinstall condor...rpm"
Set up the firewall. Our Condor pool is set for ports 9614, 9618 and 65000-65255, all for both TCP and UDP. This can be done through the GUI. Some users had firewalls enabled on SL, others did not.
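For those who prefer the command line to the GUI, the ports listed above could be opened with iptables roughly as follows. This is a sketch under assumptions: it assumes a stock iptables firewall and simply inserts ACCEPT rules into the INPUT chain, and the RUN variable and open_condor_ports function are illustrative names, not part of Condor. By default it only prints the commands it would run.

```shell
# Dry run by default: RUN is "echo" unless the caller sets it.
# Run as root with RUN set to empty (RUN= sh thisscript) to apply the rules.
RUN=${RUN-echo}

# Allow the Condor ports (9614, 9618 and the range 65000-65255)
# for both TCP and UDP.
open_condor_ports() {
    for proto in tcp udp; do
        for ports in 9614 9618 65000:65255; do
            $RUN /sbin/iptables -I INPUT -p "$proto" --dport "$ports" -j ACCEPT
        done
    done
}

open_condor_ports
```

On RHEL-style systems "service iptables save" is the usual way to keep such rules across reboots, but check how your own machine's firewall is managed before relying on this.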
Set up the condor user and group. This can again be done using the GUI, or just run the following as root:
/usr/sbin/groupadd -g 14168 condor
/usr/sbin/useradd -g 14168 -u 14168 condor
#You may have a better way.
#There are sample condor_config and condor_config.local files already set up which should work, so ...
scp email@example.com/scratch/jda23/escience/condor_config1 /opt/condor-7.0.5/etc/condor_config
#and you'll have to replace $HOSTNAME with whatever the rpm has set up in the commands below.
scp firstname.lastname@example.org/scratch/jda23/escience/condor_config.local1 /opt/condor-7.0.5/local.$HOSTNAME/condor_config.local
#deal with ownership in these directories (presumably from within /opt/condor-7.0.5, given the relative paths used below) by going
chown -R condor:condor *
#This should basically do it, but to be complete we'd like to have condor start as a service.
# There is an example init.d file in /opt/condor-7.0.5/etc/examples/condor.init
cp etc/examples/condor.init /etc/rc.d/init.d/condor
chmod a+x /etc/rc.d/init.d/condor
# put the executable somewhere where it might be expected.
ln -s /opt/condor-7.0.5/sbin/condor_master /usr/sbin/condor_master
# and the configuration
ln -s /opt/condor-7.0.5/condor.sh /etc/sysconfig/condor
There's almost certainly a better way of adding it to the appropriate runlevels than this, but it's what I've done so far...
ln -s /etc/rc.d/init.d/condor /etc/rc.d/rc5.d/S96condor
(there's a tool to add an S to levels 2345 and kill to 0,1 and 6, no? service add condor or something?)
At this point you probably want to ensure users and root have this file sourced in .bashrc (or .cshrc) so edit the .bashrc with
system-config-services should list condor and be able to start it, although it reports as dead even when it's on.
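On the question raised above about a tool that adds start links to levels 2345 and kill links to 0, 1 and 6: on Red Hat style systems such as Scientific Linux this is usually done with chkconfig. A minimal sketch follows, assuming the init script copied to /etc/rc.d/init.d/condor carries the "# chkconfig:" header line that chkconfig reads (this has not been checked against condor.init here). The RUN variable and the function name are illustrative only, and by default the commands are printed rather than run.

```shell
# Dry run by default: RUN is "echo" unless the caller sets it.
# Run as root with RUN set to empty (RUN= sh thisscript) to make the changes.
RUN=${RUN-echo}

# Register the condor init script, enable it in runlevels 2-5,
# then list the resulting runlevel settings.
register_condor_service() {
    $RUN /sbin/chkconfig --add condor
    $RUN /sbin/chkconfig --level 2345 condor on
    $RUN /sbin/chkconfig --list condor
}

register_condor_service
```

If this works on your system it would replace the manual symlink into rc5.d shown earlier.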
Watch out for the following:
Condor is particular about the /etc/hosts file and requires a more conventional layout than the SL default. I think the default puts the system name against 127.0.0.1 rather than the system's network IP address, so the entry may need changing to the static IP of the host. Otherwise the client will tell the central host to send its responses to adverts to 127.0.0.1 (which would be the central server itself) rather than to the client that needs the information.
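A quick way to see whether a machine has this problem is to look for its own name on a loopback line of /etc/hosts. The following is a rough sketch only; loopback_entries is a made-up helper name, and it treats any address starting with 127. as loopback.

```shell
# Print every non-comment line of a hosts file that maps the given
# host name to a loopback (127.x.x.x) address.
# (loopback_entries is an illustrative name, not a standard tool.)
loopback_entries() {
    hosts_file=$1
    host=$2
    awk -v h="$host" '
        /^[[:space:]]*#/ { next }            # skip whole-line comments
        $1 ~ /^127\./ {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break          # stop at a trailing comment
                if ($i == h) { print; break }
            }
        }
    ' "$hosts_file"
}
```

For example, loopback_entries /etc/hosts "$(hostname)" prints nothing if the layout is already as Condor expects. Any line it does print is a candidate for moving the host name onto a line with the machine's static IP address.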
Windows XP Installation
Installation should take about 5 minutes on an average PC. Any user comfortable with running things from a command line should be able to do this themselves, but if not, please get in touch with Jonny Smith (e-mail: email@example.com) and I'll see if I can do the install myself. The more people donate their unused computer cycles, the better the resource is for others. More details on the Condor system can be found here: http://www.cs.wisc.edu/condor/
Instructions for setting up Condor on Windows XP for users on the Daresbury campus network - Linux and Mac users please email me as alternative solutions exist.
Click on the Start button and select "Run" from the menu, then type "cmd" in the box.
Copy
to the clipboard (ctrl-C).
Right-click on the command prompt window and select Paste.
A graphical installer should start up.
On the first screen, after accepting the terms and conditions, choose join existing condor pool, and enter ci-condor.dl.ac.uk as the hostname.
Near the end of the installer, set 'start condor service after installation' to "NO". Apart from that, it shouldn't matter which options you select on the subsequent pages, as we will overwrite the configuration file with one that has all the right settings.
You'll be presented with a choice of Custom or Install; select Install to install Condor in its default location.
When it's finished putting files where they need to go, click finish.
Copy the following line and paste into the command prompt window.
copy \\astecnas\users\jda23 escience\condor_etc\condor_config
This updates the configuration settings to those we are using for the Cockcroft Condor pool.
Then type this line on your command prompt.
net start condor
You have now installed condor and started the service. You should close the window.
If you wish to test it, I would encourage you to add the executable files to your system PATH variable. To do this, right-click on 'My Computer' -> Properties -> Advanced (tab) -> Environment Variables, scroll the system variables window down to PATH, click on it, then click Edit and, at the end of the text in the box, add ;c:\condor\bin
Now open another command prompt as you did before. Try typing "condor_status"; this should tell you about the status of all nodes currently in the pool (as listed above). You can also see which ones would be available for running jobs at the moment with "condor_status -avail". You can sort these by those with the fastest processors: "condor_status -avail -sort Mips".
Various guides to submission resources are available on the Web
There are also some links on the NW-GRID portal here:
Please contact Rob Allan to get an account, e-mail: firstname.lastname@example.org