LabResources:LabBoot

Revision as of 20:47, 11 September 2008 by Scsong (talk | contribs)

Directions for restarting the entire lab

Overview of dependencies

Most services in the lab are designed to start with no external dependencies; every 'production' service starts standalone with minimal or no external dependencies. The exceptions are the reliance on the SAN components and the srb masters, which depend on a local or remote mcat.

Boot Order

  1. SAN switch (5200 in nara1)
  2. ACNC RAID (1 chassis), FastT (2 chassis), AX-150. Wait until the lights on all devices settle before continuing.
  3. App services. These can boot in any order.
    • narafs01
      • No extra startup required, all services start at boot (runs nfs, samba)
    • narawww01
      • No extra startup required, all services start at boot (runs httpd, mysql)
    • naraapp01 (narasrb01)
      • restart mcat (directions in the naraapp01 section below)
    • naraapp03 (chronopolis-mcat)
      • restart mcat (directions in the naraapp03 section below)
      • restart monitoring services (directions in the Chronopolis replication monitor section below)
    • naraapp04 (narapawn)
      • restart pawn services TODO
    • naraapp06 vmware server
      • No extra startup required, all services start at boot (runs vmware)
  4. Slave SRB services (dependencies listed)
    • naraapp02 (narasrb02) - requires erasrb01.nara.gov mcat to be running
      • Check for the service with: telnet erasrb01.nara.gov 7618
      • restart srb master (directions in the naraapp02 section below)
    • naraapp05 (narasrb03) - requires narasrb01 mcat to be running
      • restart srb master (directions in the naraapp05 section below)
  5. VMware services. Use the vmware console and log in as root (requires naraapp06 to be booted).
    • lcpawn
      • restart NDIIPP pawn services TODO
    • fedoravm01 (Optional)
      • restart Fedora TODO
    • naradev07
      • No extra startup required, all services start at boot
    • eramonitor
      • Restart era srb monitor (directions in the eramonitor section below)
    • pawnrecv01
      • restart Chronopolis pawn services TODO
    • narawks05
      • WinXp workstation, No extra startup required, all services start at boot
  6. Non-critical devel services
    • naradev01..05 can boot at any time (narafs01 is nice to have, but not required). These may be running various demos that users can restart.
    • naradev06 see Fritz or Van Opst
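
The dependency check in step 4 can be scripted so that a slave SRB master is only started once its mcat answers on port 7618. What follows is a minimal sketch, not part of the lab's tooling: the function name wait_for_port, the retry counts, and the use of bash's /dev/tcp together with the coreutils timeout command are all assumptions.

```shell
#!/bin/bash
# Hypothetical helper: wait until HOST:PORT accepts a TCP connection.
# usage: wait_for_port HOST PORT [TRIES] [DELAY_SECONDS]
wait_for_port() {
    host=$1
    port=$2
    tries=${3:-30}
    delay=${4:-10}
    attempt=1
    while :; do
        # bash opens a TCP socket when redirecting to /dev/tcp/HOST/PORT;
        # timeout caps each connection attempt at 5 seconds.
        if timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
            return 0
        fi
        # Give up once the requested number of attempts has been made.
        [ "$attempt" -ge "$tries" ] && return 1
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

On naraapp02, for example, one might run wait_for_port erasrb01.nara.gov 7618 && ./runsrb from SRBInstall/bin; on naraapp05 the host to wait for would be narasrb01.umiacs.umd.edu.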

MCAT on naraapp01

Run this as user mcat. This service is the nara-umiacs zone in the persistent archive prototype.

[mcat@naraapp01 ~]$ cd /export/srb/srb/
[mcat@naraapp01 srb]$ perl install.pl start
...
...
[mcat@naraapp01 srb]$ Sinit -v
Using Port 7618.
Client Release = SRB-3.3.1, API version = G.
Server Release = SRB-3.4.1, API version = G.
Client mcatZone = nara-umiacs
Server mcatZone = nara-umiacs

The final Sinit -v verifies that the mcat is up and running.
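
The Sinit check can also be done mechanically by comparing the reported zones against the expected one. This is a sketch under assumptions: the function name check_mcat_zone is hypothetical, and it assumes Sinit -v writes the lines shown above to standard output (append 2>&1 if it does not).

```shell
#!/bin/bash
# Hypothetical helper: read `Sinit -v` output on stdin and succeed only if
# both the client and server mcatZone lines name the expected zone.
# usage: Sinit -v | check_mcat_zone ZONE
check_mcat_zone() {
    awk -v zone="$1" '
        /^Client mcatZone = / { if ($4 == zone) client = 1 }
        /^Server mcatZone = / { if ($4 == zone) server = 1 }
        END { exit !(client && server) }
    '
}
```

On naraapp01 this would be used as Sinit -v | check_mcat_zone nara-umiacs, and on naraapp03 with chronopolis-umiacs as the zone.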

MCAT on naraapp03

This service runs as user mcat. This is the chronopolis-umiacs zone for Chronopolis.

[mcat@naraapp03 ~]$ cd /export/srb/srb/
[mcat@naraapp03 srb]$ perl install.pl start
This script is install.pl version 3.x, last updated February 21, 2006
This host is chronopolis-mcat.umiacs.umd.edu
This host full network name is chronopolis-mcat.umiacs.umd.edu
This host full network address is chronopolis-mcat.umiacs.umd.edu
Your home directory is /export/homes/mcat
Note: SRB_DIR determined to be SRB3_4_1
running: /export/srb/srb/pgsql/bin/pg_ctl start -o '-i' -l /export/srb/srb/pgsql/data/pgsql.log 
chdir'ing to: /export/srb/srb/SRBInstall/bin
sleeping a second
running: ./runsrb 
mv: cannot stat `./../data/srbLog': No such file or directory
./runsrb: line 246: /bin/rm: Argument list too long
findServerExec: found "/export/srb/srb/SRBInstall/bin/./srbServer" using argv[0]
logFile: ../data/log/srbLog.9.6.6 opened successfully.
running: ps -el | grep srb | grep -v grep 
If the srb server is running OK, you should see srbMaster and srbServer here:
1 S 10098   559     1  0  77   0 -  2858 -      ?  00:00:00 srbMaster-3.4.1
0 S 10098   560   559  0  76   0 -  2778 -      ?  00:00:00 srbFileChk
0 Z 10098   563   559  1  76   0 -     0 exit   ?  00:00:00 srbServer <defunct>
0 S 10098   564   559  0  76   0 -  3401 pipe_w ?  00:00:00 srbServer
chdir'ing to: ../..
Done starting Postgres and SRB servers at install.pl line 525.
[mcat@naraapp03 srb]$ Sinit -v
Using default Port 7618.
Client Release = SRB-3.4.1, API version = G.
Server Release = SRB-3.4.1, API version = G.
Client mcatZone = chronopolis-umiacs
Server mcatZone = chronopolis-umiacs

The final Sinit -v shows a working mcat.
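
The install.pl transcript asks the reader to look for srbMaster and srbServer in the ps listing, and the listing above also contains a defunct srbServer. A small filter can make that check explicit. This is only a sketch: the function name srb_processes_ok is hypothetical, and treating defunct entries as "not running" is an assumption about what counts as healthy.

```shell
#!/bin/bash
# Hypothetical helper: read `ps -el` output on stdin and succeed if at least
# one srbMaster and one live (non-defunct) srbServer are listed.
# usage: ps -el | srb_processes_ok
srb_processes_ok() {
    awk '
        /<defunct>/ { next }        # skip zombie entries
        /srbMaster/ { master++ }
        /srbServer/ { server++ }
        END { exit !(master > 0 && server > 0) }
    '
}
```

For the listing above the check succeeds, because one srbMaster and one live srbServer remain after the defunct entry is skipped.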

Replication Monitor for Chronopolis

Depends on mysql, which should start at boot. The monitor runs in tomcat as user naraapp.

[naraapp@naraapp03 ~]$ cd apache-tomcat-5.5.20
[naraapp@naraapp03 ~/apache-tomcat-5.5.20]$ bin/startup.sh 
Using CATALINA_BASE:   /export/homes/naraapp/apache-tomcat-5.5.20
Using CATALINA_HOME:   /export/homes/naraapp/apache-tomcat-5.5.20
Using CATALINA_TMPDIR: /export/homes/naraapp/apache-tomcat-5.5.20/temp
Using JRE_HOME: /opt/jdk1.5.0
[naraapp@naraapp03 ~/apache-tomcat-5.5.20]$ 

Wait a minute and connect to [[1][2]]


SRB Master on naraapp02

This service runs as user mcat. It requires that the MCAT on erasrb01.nara.gov is up and running. naraapp02 also runs a standalone mcat for an sdsc ggf/wwd demo as user naraapp. There are no dependencies for this demo.

SRB Master for erasrb01

[mcat@naraapp02 ~]$ cd SRBInstall/bin/
[mcat@naraapp02 bin]$ ./runsrb
mv: cannot stat `./../data/srbLog': No such file or directory
rm: cannot remove `./../data/lockDir/.[a-z]*': No such file or directory
rm: cannot remove `./../data/lockDir/CVS': Is a directory
findServerExec: found "/export/homes/mcat/SRBInstall/bin/./srbServer" using argv[0]
logFile: ../data/log/srbLog.9.6.6 opened successfully.
resource: narasrb02, storSysType: 0, vaultPath: /export/vault01/vault
Local Zone : 
ZoneName = nara  HostName = erasrb01.nara.gov  PortNum = 7618
Remote Zone : 
ZoneName = nara-dc  HostName = 207.245.162.200  PortNum = 7618
ZoneName = nara-gtri  HostName = bush41.gtri.gatech.edu  PortNum = 7618
ZoneName = nara-sdsc  HostName = srb-mcat.sdsc.edu  PortNum = 7618
ZoneName = nara-umiacs  HostName = narasrb01.umiacs.umd.edu  PortNum = 7618
NOTICE:Sep  6 13:17:47: srbMaster version SRB-3.4.1&G is up.
findServerExec: found "/export/homes/mcat/SRBInstall/bin/./srbServer" using argv[0]
naraapp  11688     1  0 Aug13 ?  00:00:00 ./srbMaster-3.4.0 -d 1 -S
mcat     16547     1  0 13:17 ?  00:00:00 ./srbMaster-3.4.1 -d 1 -S
mcat     16562 16543  0 13:17 pts/1    00:00:00 grep srbMaster
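
Because naraapp02 runs two masters (one as mcat, one as naraapp), it can help to check each owner separately instead of reading the combined listing. A minimal sketch, with the hypothetical function name master_running_as, that filters ps -ef output:

```shell
#!/bin/bash
# Hypothetical helper: read `ps -ef` output on stdin and succeed if USER owns
# an srbMaster process. Lines produced by grep itself are ignored.
# usage: ps -ef | master_running_as USER
master_running_as() {
    awk -v user="$1" '
        $1 == user && /srbMaster/ && !/grep/ { found = 1 }
        END { exit !found }
    '
}
```

For the listing above, ps -ef | master_running_as mcat and ps -ef | master_running_as naraapp would both succeed.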

WWD Demo

[naraapp@naraapp02 ~]$ cd /export/wwd/srb/
[naraapp@naraapp02 srb]$ perl install.pl start
This script is install.pl version 3.x, last updated February 21, 2006
This host is narasrb02.umiacs.umd.edu
This host full network name is narasrb02.umiacs.umd.edu
This host full network address is narasrb02.umiacs.umd.edu
Your home directory is /export/homes/naraapp
Note: SRB_DIR determined to be SRB3_4_0
running: /export/wwd/srb/pgsql/bin/pg_ctl start -o '-i' -l /export/wwd/srb/pgsql/data/pgsql.log 
chdir'ing to: /export/wwd/srb/SRBInstall/bin
sleeping a second
running: ./runsrb 
mv: cannot stat `./../data/srbLog': No such file or directory
rm: cannot remove `./../data/lockDir/CVS': Is a directory
findServerExec: found "/export/wwd/srb/SRBInstall/bin/./srbServer" using argv[0]
logFile: ../data/log/srbLog.9.6.6 opened successfully.
running: ps -el | grep srb | grep -v grep 
If the srb server is running OK, you should see srbMaster and srbServer here:
1 S 10098 16547     1  0  76   0 -  2627 -      ?  00:00:00 srbMaster-3.4.1
0 S 10098 16550 16547  0  76   0 -  3092 -      ?  00:00:00 srbServer
1 S 10292 17285     1  0  76   0 -  7036 -      ?  00:00:00 srbMaster-3.4.0
chdir'ing to: ../..
Done starting Postgres and SRB servers at install.pl line 525.
[naraapp@naraapp02 srb]$ Sinit -v
Using default Port 7619.
Client Release = SRB-3.4.0, API version = G.
Server Release = SRB-3.4.0, API version = G.
Client mcatZone = umiacs
Server mcatZone = umiacs

SRB Master on naraapp05

This host runs an srb master for narasrb01 (as user mcat). The MCAT on naraapp01 must be up and running before the srb master is started.

SRB Master

As user mcat.

[mcat@naraapp05 ~]$ cd SRBInstall/bin/
[mcat@naraapp05 bin]$ ./runsrb
mv: cannot stat `./../data/srbLog': No such file or directory
rm: cannot remove `./../data/lockDir/.[a-z]*': No such file or directory
rm: cannot remove `./../data/lockDir/CVS': Is a directory
findServerExec: found "/export/homes/mcat/SRBInstall/bin/./srbServer" using argv[0]
logFile: ../data/log/srbLog.9.6.6 opened successfully.
resource: narasrb03-unix, storSysType: 0, vaultPath: /export/vault01/vault
Local Zone : 
ZoneName = nara-umiacs  HostName = narasrb01.umiacs.umd.edu  PortNum = 7618
Remote Zone : 
ZoneName = nara  HostName = erasrb01.nara.gov  PortNum = 7618
ZoneName = nara-dc  HostName = 207.245.162.200  PortNum = 7618
ZoneName = nara-gtri  HostName = bush41.gtri.gatech.edu  PortNum = 7618
ZoneName = nara-sdsc  HostName = srb-mcat.sdsc.edu  PortNum = 7618
NOTICE:Sep  6 13:26:24: srbMaster version SRB-3.4.1&G is up.
findServerExec: found "/export/homes/mcat/SRBInstall/bin/./srbServer" using argv[0]
mcat      7920     1  0 13:26 ?        00:00:00 ./srbMaster-3.4.1 -d 1 -S
mcat      7933  7916  0 13:26 pts/0    00:00:00 grep srbMaster

Replication monitor on eramonitor

Two pieces run here. The first is mysql, configured to start at boot; its data directory is /export/monitor/mysql. The second is tomcat, running as user naraapp, which hosts the monitoring software.

[naraapp@eramonitor ~]$ ps -ef|grep my
root     14644     1  0 11:46 pts/0    00:00:00 /bin/sh /usr/bin/mysqld_safe --defaults-file=/etc/my.cnf --pid-file=/var/run/mysqld/mysqld.pid
mysql    14677 14644  8 11:46 pts/0    00:00:00 /usr/libexec/mysqld --defaults-file=/etc/my.cnf --basedir=/usr --datadir=/export/monitor/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-locking --socket=/var/lib/mysql/mysql.sock
naraapp  14698 14437  0 11:46 pts/0    00:00:00 grep my
[naraapp@eramonitor ~]$ cd /export/monitor/srb-replication/jakarta-tomcat-5.5.9/
[naraapp@eramonitor jakarta-tomcat-5.5.9]$ bin/startup.sh 
Using CATALINA_BASE:   /export/monitor/srb-replication/jakarta-tomcat-5.5.9
Using CATALINA_HOME:   /export/monitor/srb-replication/jakarta-tomcat-5.5.9
Using CATALINA_TMPDIR: /export/monitor/srb-replication/jakarta-tomcat-5.5.9/temp
Using JRE_HOME: /opt/jdk1.5.0_07

To test, wait a few minutes, then connect to http://eramonitor.umiacs.umd.edu:8080/srb-monitor
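
Instead of waiting a fixed time, the test can be retried until the monitor answers. This is a sketch, not an existing lab script: the function name retry and the attempt count and delay are assumptions, and it presumes curl is installed on the machine running the check.

```shell
#!/bin/bash
# Hypothetical helper: run a command until it succeeds or TRIES attempts fail.
# usage: retry TRIES DELAY_SECONDS COMMAND [ARGS...]
retry() {
    tries=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        # Stop once the requested number of attempts has been made.
        [ "$n" -ge "$tries" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}
```

For example, retry 12 10 curl -fsS -o /dev/null http://eramonitor.umiacs.umd.edu:8080/srb-monitor would poll the monitor every 10 seconds for up to 12 attempts and fail if it never responds successfully.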