LabResources:LabBoot
Directions for restarting the entire lab
Overview of dependencies
Most services in the lab, and every 'production' service, are designed to start standalone with minimal to no external dependencies. The exceptions are the reliance on the SAN components and the srb masters, which depend on a local or remote mcat.
Boot Order
- SAN switch (5200 in nara1)
- ACNC RAID (1 chassis), FastT (2 chassis), AX-150. Wait until lights on all devices settle before continuing.
- App services. These can boot in any order:
  - narafs01
    - No extra startup required, all services start at boot (runs nfs, samba)
  - narawww01
    - No extra startup required, all services start at boot (runs httpd, mysql)
  - naraapp01 (narasrb01)
    - restart mcat [[#NaraSrb01Mcat][Directions]]
  - naraapp03 (chronopolis-mcat)
    - restart mcat [[#ChronopolisMcat][Directions]]
    - restart monitoring services [[#ChronopolisMonitor][Directions]]
  - naraapp04 (narapawn)
    - restart pawn services TODO
  - naraapp06 vmware server
    - No extra startup required, all services start at boot (runs vmware)
- Slave SRB services (dependency listed):
  - naraapp02 (narasrb02) - requires erasrb01.nara.gov mcat to be running
    - Check for service by =telnet erasrb01.nara.gov 7618=
    - restart srb master [[#NaraSrb02][Directions]]
  - naraapp05 (narasrb03) - requires narasrb01 mcat to be running
    - restart srb master [[#NaraSrb03][Directions]]
- VMware services. Use the vmware console, log in as root _requires naraapp06 to be booted_:
  - lcpawn
    - restart NDIIPP pawn services TODO
  - fedoravm01 (Optional)
    - restart Fedora TODO
  - naradev07
    - No extra startup required, all services start at boot
  - eramonitor
    - Restart era srb monitor. [[#EraMonitor][Directions]]
  - pawnrecv01
    - restart Chronopolis pawn services TODO
  - narawks05
    - WinXP workstation, no extra startup required, all services start at boot
- Non-critical devel services:
  - naradev01..05 can boot anytime (narafs01 is nice to have, but not required). These may be running various demos that users can restart.
  - naradev06 see Fritz or Van Opst
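The slave SRB masters must not be started until their mcat answers on port 7618, which the list above checks by hand with telnet. A minimal sketch of the same wait, scripted, is shown here. The function names `wait_for` and `port_open` are hypothetical, and `port_open` assumes bash (for its /dev/tcp redirection) and the coreutils `timeout` command are installed on the host.

```shell
# Retry a probe command until it succeeds or the attempts run out.
# $1 = number of attempts, $2 = seconds to sleep between attempts,
# remaining arguments = the probe command and its arguments.
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Probe a TCP port (assumption: bash and coreutils timeout are available).
# $1 = host, $2 = port. Succeeds if a connection can be opened in 3 seconds.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2"
}

# Example (not run here): wait up to 5 minutes for the remote mcat,
# then start the srb master on naraapp02.
#   wait_for 30 10 port_open erasrb01.nara.gov 7618 && (cd ~/SRBInstall/bin && ./runsrb)
```

Keeping the probe as a separate command means the same loop can wait on any other check, not only a port test.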
MCAT on naraapp01
Run this as user *mcat*. This service is the nara-umiacs zone in the persistent archive prototype.
[mcat@naraapp01 ~]$ cd /export/srb/srb/
[mcat@naraapp01 srb]$ perl install.pl start
...
...
[mcat@naraapp01 srb]$ Sinit -v
Using Port 7618.
Client Release = SRB-3.3.1, API version = G.
Server Release = SRB-3.4.1, API version = G.
Client mcatZone = nara-umiacs
Server mcatZone = nara-umiacs
The last Sinit will verify the mcat is up and running.
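If the check needs to be scripted, the zone lines printed by Sinit -v can be compared against the zone the host is expected to serve. This is only a sketch: `check_zone` is a hypothetical helper, and it assumes the output contains the "Client mcatZone = ..." and "Server mcatZone = ..." text exactly as in the session above.

```shell
# Read `Sinit -v` output on stdin and succeed only if both the client and
# the server report the expected zone name given as $1.
check_zone() {
    expected=$1
    output=$(cat)
    client=$(printf '%s\n' "$output" |
        sed -n 's/.*Client mcatZone = \([^ ]*\).*/\1/p' | head -n 1)
    server=$(printf '%s\n' "$output" |
        sed -n 's/.*Server mcatZone = \([^ ]*\).*/\1/p' | head -n 1)
    [ -n "$client" ] && [ "$client" = "$expected" ] && [ "$server" = "$expected" ]
}

# Example (not run here), as user mcat on naraapp01:
#   Sinit -v | check_zone nara-umiacs && echo "mcat is up"
```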
MCAT on naraapp03
This service runs as user *mcat*. This is the chronopolis-umiacs zone for Chronopolis.
[mcat@naraapp03 ~]$ cd /export/srb/srb/
[mcat@naraapp03 srb]$ perl install.pl start
This script is install.pl version 3.x, last updated February 21, 2006
This host is chronopolis-mcat.umiacs.umd.edu
This host full network name is chronopolis-mcat.umiacs.umd.edu
This host full network address is chronopolis-mcat.umiacs.umd.edu
Your home directory is /export/homes/mcat
Note: SRB_DIR determined to be SRB3_4_1
running: /export/srb/srb/pgsql/bin/pg_ctl start -o '-i' -l /export/srb/srb/pgsql/data/pgsql.log
chdir'ing to: /export/srb/srb/SRBInstall/bin
sleeping a second
running: ./runsrb
mv: cannot stat `./../data/srbLog': No such file or directory
./runsrb: line 246: /bin/rm: Argument list too long
findServerExec: found "/export/srb/srb/SRBInstall/bin/./srbServer" using argv[0]
logFile: ../data/log/srbLog.9.6.6 opened successfully.
running: ps -el | grep srb | grep -v grep
If the srb server is running OK, you should see srbMaster and srbServer here:
1 S 10098 559 1 0 77 0 - 2858 - ? 00:00:00 srbMaster-3.4.1
0 S 10098 560 559 0 76 0 - 2778 - ? 00:00:00 srbFileChk
0 Z 10098 563 559 1 76 0 - 0 exit ? 00:00:00 srbServer <defunct>
0 S 10098 564 559 0 76 0 - 3401 pipe_w ? 00:00:00 srbServer
chdir'ing to: ../..
Done starting Postgres and SRB servers at install.pl line 525.
[mcat@naraapp03 srb]$ Sinit -v
Using default Port 7618.
Client Release = SRB-3.4.1, API version = G.
Server Release = SRB-3.4.1, API version = G.
Client mcatZone = chronopolis-umiacs
Server mcatZone = chronopolis-umiacs
The last Sinit shows a working mcat.
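The install.pl output also says that both srbMaster and srbServer should appear in the process listing. A rough sketch of that check follows. `srb_procs_ok` is a hypothetical helper that reads `ps -el` output on stdin and ignores defunct entries such as the one shown above; treating a defunct srbServer as harmless is an assumption based on that transcript.

```shell
# Read `ps -el` output on stdin. Succeed only if at least one live
# srbMaster and one live srbServer process are listed.
srb_procs_ok() {
    awk '
        /<defunct>/ { next }          # skip zombie entries
        /srbMaster/ { master++ }
        /srbServer/ { server++ }
        END { exit !(master > 0 && server > 0) }
    '
}

# Example (not run here):
#   ps -el | srb_procs_ok || echo "srbMaster or srbServer is missing"
```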
Replication Monitor for Chronopolis
Depends on mysql, which should start at boot. The monitor runs in tomcat as user naraapp.
[naraapp@naraapp03 ~]$ cd apache-tomcat-5.5.20
[naraapp@naraapp03 ~/apache-tomcat-5.5.20]$ bin/startup.sh
Using CATALINA_BASE: /export/homes/naraapp/apache-tomcat-5.5.20
Using CATALINA_HOME: /export/homes/naraapp/apache-tomcat-5.5.20
Using CATALINA_TMPDIR: /export/homes/naraapp/apache-tomcat-5.5.20/temp
Using JRE_HOME: /opt/jdk1.5.0
[naraapp@naraapp03 ~/apache-tomcat-5.5.20]$
Wait a minute and connect to [[1][2]]
SRB Master on naraapp02
This service runs as user *mcat*. It requires that the MCAT on erasrb01.nara.gov be up and running. naraapp02 also runs a standalone mcat for an sdsc ggf/wwd demo as user naraapp. There are no dependencies for this demo.
SRB Master for erasrb01
[mcat@naraapp02 ~]$ cd SRBInstall/bin/
[mcat@naraapp02 bin]$ ./runsrb
mv: cannot stat `./../data/srbLog': No such file or directory
rm: cannot remove `./../data/lockDir/.[a-z]*': No such file or directory
rm: cannot remove `./../data/lockDir/CVS': Is a directory
findServerExec: found "/export/homes/mcat/SRBInstall/bin/./srbServer" using argv[0]
logFile: ../data/log/srbLog.9.6.6 opened successfully.
resource: narasrb02, storSysType: 0, vaultPath: /export/vault01/vault
Local Zone :
ZoneName = nara HostName = erasrb01.nara.gov PortNum = 7618
Remote Zone :
ZoneName = nara-dc HostName = 207.245.162.200 PortNum = 7618
ZoneName = nara-gtri HostName = bush41.gtri.gatech.edu PortNum = 7618
ZoneName = nara-sdsc HostName = srb-mcat.sdsc.edu PortNum = 7618
ZoneName = nara-umiacs HostName = narasrb01.umiacs.umd.edu PortNum = 7618
NOTICE:Sep 6 13:17:47: srbMaster version SRB-3.4.1&G is up.
findServerExec: found "/export/homes/mcat/SRBInstall/bin/./srbServer" using argv[0]
naraapp 11688 1 0 Aug13 ? 00:00:00 ./srbMaster-3.4.0 -d 1 -S
mcat 16547 1 0 13:17 ? 00:00:00 ./srbMaster-3.4.1 -d 1 -S
mcat 16562 16543 0 13:17 pts/1 00:00:00 grep srbMaster
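Because naraapp02 runs two srb masters under different users (mcat for the erasrb01 zone, naraapp for the WWD demo), a plain grep for srbMaster cannot tell which one is up. The sketch below filters `ps -ef` output by owner. `master_running_for` is a hypothetical helper, and it assumes the user name appears in the first column as in the listing above.

```shell
# Read `ps -ef` output on stdin. Succeed if the user named in $1 owns an
# srbMaster process; the grep command itself is excluded from the match.
master_running_for() {
    awk -v user="$1" '
        $1 == user && /srbMaster/ && !/grep/ { found = 1 }
        END { exit !found }
    '
}

# Example (not run here):
#   ps -ef | master_running_for mcat || echo "erasrb01 srb master is down"
```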
WWD Demo
[naraapp@naraapp02 ~]$ cd /export/wwd/srb/
[naraapp@naraapp02 srb]$ perl install.pl start
This script is install.pl version 3.x, last updated February 21, 2006
This host is narasrb02.umiacs.umd.edu
This host full network name is narasrb02.umiacs.umd.edu
This host full network address is narasrb02.umiacs.umd.edu
Your home directory is /export/homes/naraapp
Note: SRB_DIR determined to be SRB3_4_0
running: /export/wwd/srb/pgsql/bin/pg_ctl start -o '-i' -l /export/wwd/srb/pgsql/data/pgsql.log
chdir'ing to: /export/wwd/srb/SRBInstall/bin
sleeping a second
running: ./runsrb
mv: cannot stat `./../data/srbLog': No such file or directory
rm: cannot remove `./../data/lockDir/CVS': Is a directory
findServerExec: found "/export/wwd/srb/SRBInstall/bin/./srbServer" using argv[0]
logFile: ../data/log/srbLog.9.6.6 opened successfully.
running: ps -el | grep srb | grep -v grep
If the srb server is running OK, you should see srbMaster and srbServer here:
1 S 10098 16547 1 0 76 0 - 2627 - ? 00:00:00 srbMaster-3.4.1
0 S 10098 16550 16547 0 76 0 - 3092 - ? 00:00:00 srbServer
1 S 10292 17285 1 0 76 0 - 7036 - ? 00:00:00 srbMaster-3.4.0
chdir'ing to: ../..
Done starting Postgres and SRB servers at install.pl line 525.
[naraapp@naraapp02 srb]$ Sinit -v
Using default Port 7619.
Client Release = SRB-3.4.0, API version = G.
Server Release = SRB-3.4.0, API version = G.
Client mcatZone = umiacs
Server mcatZone = umiacs
SRB Master on naraapp05
This host runs an srb master for narasrb01 (user mcat). The MCAT on naraapp01 must be up and running before the srb master is started.
SRB Master
As user mcat.
[mcat@naraapp05 ~]$ cd SRBInstall/bin/
[mcat@naraapp05 bin]$ ./runsrb
mv: cannot stat `./../data/srbLog': No such file or directory
rm: cannot remove `./../data/lockDir/.[a-z]*': No such file or directory
rm: cannot remove `./../data/lockDir/CVS': Is a directory
findServerExec: found "/export/homes/mcat/SRBInstall/bin/./srbServer" using argv[0]
logFile: ../data/log/srbLog.9.6.6 opened successfully.
resource: narasrb03-unix, storSysType: 0, vaultPath: /export/vault01/vault
Local Zone :
ZoneName = nara-umiacs HostName = narasrb01.umiacs.umd.edu PortNum = 7618
Remote Zone :
ZoneName = nara HostName = erasrb01.nara.gov PortNum = 7618
ZoneName = nara-dc HostName = 207.245.162.200 PortNum = 7618
ZoneName = nara-gtri HostName = bush41.gtri.gatech.edu PortNum = 7618
ZoneName = nara-sdsc HostName = srb-mcat.sdsc.edu PortNum = 7618
NOTICE:Sep 6 13:26:24: srbMaster version SRB-3.4.1&G is up.
findServerExec: found "/export/homes/mcat/SRBInstall/bin/./srbServer" using argv[0]
mcat 7920 1 0 13:26 ? 00:00:00 ./srbMaster-3.4.1 -d 1 -S
mcat 7933 7916 0 13:26 pts/0 00:00:00 grep srbMaster
Replication monitor on eramonitor
Two pieces run here. The first is mysql, configured to start at boot; its data directory is /export/monitor/mysql. The second is tomcat, running as user naraapp, which hosts the monitoring software.
[naraapp@eramonitor ~]$ ps -ef|grep my
root 14644 1 0 11:46 pts/0 00:00:00 /bin/sh /usr/bin/mysqld_safe --defaults-file=/etc/my.cnf --pid-file=/var/run/mysqld/mysqld.pid
mysql 14677 14644 8 11:46 pts/0 00:00:00 /usr/libexec/mysqld --defaults-file=/etc/my.cnf --basedir=/usr --datadir=/export/monitor/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-locking --socket=/var/lib/mysql/mysql.sock
naraapp 14698 14437 0 11:46 pts/0 00:00:00 grep my
[naraapp@eramonitor ~]$ cd /export/monitor/srb-replication/jakarta-tomcat-5.5.9/
[naraapp@eramonitor jakarta-tomcat-5.5.9]$ bin/startup.sh
Using CATALINA_BASE: /export/monitor/srb-replication/jakarta-tomcat-5.5.9
Using CATALINA_HOME: /export/monitor/srb-replication/jakarta-tomcat-5.5.9
Using CATALINA_TMPDIR: /export/monitor/srb-replication/jakarta-tomcat-5.5.9/temp
Using JRE_HOME: /opt/jdk1.5.0_07
To test, wait a few minutes, then connect to http://eramonitor.umiacs.umd.edu:8080/srb-monitor
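Rather than waiting a fixed time, the page can be polled until tomcat has deployed the application. The following is a sketch under stated assumptions: `monitor_up` is a hypothetical helper, curl is assumed to be installed, and any successful HTTP status from the URL is taken to mean the monitor is ready.

```shell
# Poll a URL until it answers with a successful HTTP status.
# $1 = URL, $2 = number of attempts, $3 = seconds between attempts.
monitor_up() {
    url=$1
    attempts=$2
    delay=$3
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f makes curl fail on HTTP errors; the body is discarded.
        if curl -fsS --max-time 10 -o /dev/null "$url" 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example (not run here): try every 30 seconds for up to 5 minutes.
#   monitor_up http://eramonitor.umiacs.umd.edu:8080/srb-monitor 10 30
```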