This configuration has been tested with tftp-server-0.42-3.1.el5.centos, systemimager 4.0.2, systemconfigurator 2.2.11, perl-AppConfig 1.64, and perl-File-HomeDir 0.62.

Have a look at http://wiki.systemimager.org/index.php/Quick_Start_HOWTO.

On dulak-server

Download "systemimager-*noarch.rpm" from http://sourceforge.net/project/showfiles.php?group_id=259, "systemconfigurator-*noarch.rpm" from http://sourceforge.net/project/showfiles.php?group_id=24006, perl-AppConfig from http://dag.wieers.com/rpm/packages/perl-AppConfig, and perl-File-HomeDir from http://dag.wieers.com/rpm/packages/perl-File-HomeDir to dulak-server and the "Golden Client":

  • make sure that yum-utils is installed:

    yum install yum-utils
  • go to the external RPMS directory:

    cd /home/dulak-server/rpm/external
  • try first:

    yumdownloader --resolve perl-AppConfig
  • if this fails, download the RPMs from the web and install them:

    yum localinstall --nogpgcheck perl-AppConfig-*.noarch.rpm perl-File-HomeDir-*.noarch.rpm

    Then download the remaining packages.

  • install the RPMs:

    yum localinstall --nogpgcheck systemconfigurator-* systemimager-common-* systemimager-server-* \
                                  systemimager-i386boot-standard-* systemimager-x86_64boot-standard-*
  • run si_mkdhcpserver (this creates the /root/dhcpd.conf file, which after some editing will serve as the DHCP configuration file):

    cd
    si_mkdhcpserver

    Answer in the following way:

    [root@dulak-server systemimager]# si_mkdhcpserver
    
     Continue? (y/[n]): y
    
    
     Type your response or hit <Enter> to accept [defaults].  If you don't
     have a response, such as no first or second DNS server, just hit
     <Enter> and none will be used.
    
     What is your DHCP daemon major version number (2 or 3)? [3]: 3
     What is the name of your DHCP daemon config file? [/etc/dhcpd.conf]: /root/dhcpd.conf
     What is your domain name? []: dulak-cluster.fysik.dtu.dk
     What is your network number? [192.168.1.0]: 10.3.0.0
     What is your netmask? [255.0.0.0]: 255.255.255.0
     What is the starting IP address for your dhcp range? [10.3.0.1]: 10.3.0.100
     What is the ending IP address for your dhcp range? [10.3.0.254]: 10.3.0.200
     What is the IP address of your first DNS server? []: 10.3.0.2
     What is the IP address of your second DNS server? []:
     What is the IP address of your default gateway? [10.3.0.254]: 10.3.0.2
     What is the IP address of your image server? [10.3.0.254]: 10.3.0.2
     What is the IP address of your boot server? [10.3.0.254]: 10.3.0.2
     What is the IP address of your log server? []: 10.3.0.2
     If your log server uses a non-standard port, enter it here: []:
     Use tmpfs staging on client?  (If unsure, choose "n") [n]:
     Do you want to use Flamethrower (multicast) to install your clients? [n]:
    
     What... is the air-speed velocity of an unladen swallow? []:
    
     Wrong!!! (with a Monty Python(TM) accent...)
    
     Press <Enter> to continue...
    
    
     Ahh, but seriously folks...
     Here are the values you have chosen:
    
     #######################################################################
     ISC DHCP daemon version:                  3
     ISC DHCP daemon config file:              /root/dhcpd.conf
     DNS domain name:                          dulak-cluster.fysik.dtu.dk
     Network number:                           10.3.0.0
     Netmask:                                  255.255.255.0
     Starting IP address for your DHCP range:  10.3.0.100
     Ending IP address for your DHCP range:    10.3.0.200
     First DNS server:                         10.3.0.2
     Second DNS server:
     Third DNS server:
     Default gateway:                          10.3.0.2
     Image server:                             10.3.0.2
     Boot server:                              10.3.0.2
     Log server:                               10.3.0.2
     Log server port:
     Flamethrower directory port:
     Use tmpfs staging on client:              n
     SSH files download URL:
     #######################################################################
    
     Are you satisfied? (y/[n]): y
    
    
     Would you like me to restart your DHCP server software now? (y/[n]):
     Would you like me to restart your DHCP server software now? (y/[n]):

    Now, edit /root/dhcpd.conf to include fixed-address definitions of your compute nodes:

    subnet 10.3.0.0 netmask 255.255.255.0 {
      #  range  10.3.0.100 10.3.0.200;
      option domain-name "dulak-cluster.fysik.dtu.dk";
      option domain-name-servers 10.3.0.2;
      option routers 10.3.0.2;            # Fake default gateway
      #
    }
    # Cluster nodes
    # host definitions must be outside local scopes
    # https://bugzilla.redhat.com/show_bug.cgi?id=449192
    #
    host n001 { hardware ethernet 00:06:5b:01:c2:c1; fixed-address n001.dulak-cluster.fysik.dtu.dk;}
    host n002 { hardware ethernet 00:06:5b:01:b0:6c; fixed-address n002.dulak-cluster.fysik.dtu.dk;}
    host n003 { hardware ethernet 00:06:5b:01:c2:c7; fixed-address n003.dulak-cluster.fysik.dtu.dk;}
    host n004 { hardware ethernet 00:06:5b:01:b5:83; fixed-address n004.dulak-cluster.fysik.dtu.dk;}

    Use this file as the DHCP configuration file:

    cp /root/dhcpd.conf /etc/dhcpd.conf
    service dhcpd restart
  • comment out the "/var/lib/nfs/*" line in the /etc/systemimager/getimage.exclude file (present in systemimager-3.8.2).

  • make sure that tftp-server is installed:

    yum install tftp-server
  • edit /etc/xinetd.d/tftp:

    disable=no
    server_args             = -s /tftpboot -r blksize -v
  • restart xinetd:

    service xinetd restart
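
The fixed-address host entries added to /root/dhcpd.conf earlier in this list can also be generated rather than typed by hand. The following is only a sketch: make_dhcp_hosts is a hypothetical helper, and it assumes an input of one "name MAC" pair per line, which is a format invented here for illustration.

```shell
# Hypothetical helper: print one dhcpd.conf host entry per "name MAC" input line.
# Usage: make_dhcp_hosts DOMAIN < list-of-nodes
make_dhcp_hosts() {
    domain=$1
    while read -r name mac; do
        # Skip blank lines and comment lines in the input.
        case $name in
            ''|'#'*) continue ;;
        esac
        printf 'host %s { hardware ethernet %s; fixed-address %s.%s;}\n' \
            "$name" "$mac" "$name" "$domain"
    done
}
```

For example, with a (hypothetical) list file /root/nodes.txt, the entries could be appended with make_dhcp_hosts dulak-cluster.fysik.dtu.dk < /root/nodes.txt >> /root/dhcpd.conf, before copying the file to /etc/dhcpd.conf.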

The remaining steps on this page must be postponed until you have installed the "Golden Client". Now go to configuring NFS.

On "Golden Client"

  • install on "Golden Client":

    cd /home/dulak-server/rpm
    rpm -ivh systemconfigurator-* systemimager-common-* systemimager-client-* \
             systemimager-i386initrd_template-* perl-AppConfig-*.noarch.rpm perl-File-HomeDir-*.noarch.rpm
  • run:

    si_prepareclient --server dulak-server.dulak-cluster.fysik.dtu.dk
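
Because the rpm command above relies on shell wildcards, a package that was never downloaded only shows up as an rpm error. The sketch below checks beforehand that every pattern matches at least one file; missing_rpms is a hypothetical helper, not part of systemimager.

```shell
# Hypothetical helper: report package patterns that match no file in a directory.
# Usage: missing_rpms DIR PATTERN...
missing_rpms() {
    dir=$1
    shift
    status=0
    for pattern in "$@"; do
        found=no
        # The unquoted pattern is expanded by the shell; if nothing matches,
        # it stays literal and the -e test below fails.
        for f in "$dir"/$pattern; do
            if [ -e "$f" ]; then
                found=yes
                break
            fi
        done
        if [ "$found" = no ]; then
            echo "missing: $pattern"
            status=1
        fi
    done
    return "$status"
}
```

It could be called as missing_rpms /home/dulak-server/rpm 'systemconfigurator-*' 'systemimager-client-*' (patterns quoted so that the function, not the calling shell, expands them); it prints nothing and returns 0 when every pattern matches.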

On dulak-server

  • download exclude.getimage to /root, and run:

    si_getimage -golden-client n001.dulak-cluster.fysik.dtu.dk -image n001.dulak-cluster \
                -ip-assignment dhcp -exclude-file /root/exclude.getimage -post-install reboot
  • create an alias for later use when updating the image (put the following line into ~/.bashrc):

    alias getn001='si_getimage -golden-client n001.dulak-cluster.fysik.dtu.dk -image n001.dulak-cluster \
                  -ip-assignment dhcp -exclude-file /root/exclude.getimage -post-install reboot \
                  -update-script NO'
  • execute the following commands:

    cd /var/lib/systemimager/images/n001.dulak-cluster/etc/systemimager/boot
    cp kernel /tftpboot/kernel.n001.dulak-cluster
    cp initrd.img /tftpboot/initrd.img.n001.dulak-cluster
    cp /etc/systemimager/pxelinux.cfg/message.txt /tftpboot
  • create /tftpboot/pxelinux.cfg/default.node_install.n001.dulak-cluster file:

    DEFAULT systemimager
    LABEL systemimager
    DISPLAY message.txt
    PROMPT 1
    TIMEOUT 50
    KERNEL kernel.n001.dulak-cluster
    APPEND vga=extended initrd=initrd.img.n001.dulak-cluster root=/dev/ram ramdisk_size=65536 \
           tmpfs_size=2500M ramdisk_blocksize=1024
  • make the file world-readable:

    chmod go+r  /tftpboot/pxelinux.cfg/default.node_install.n001.dulak-cluster
  • if you want the cloning of the nodes to become the default PXE option (not recommended), do the following (creating the link from within the pxelinux.cfg directory is important!):

    cd /tftpboot/pxelinux.cfg/
    rm default
    ln -s default.node_install.n001.dulak-cluster default
  • run:

    si_addclients --hosts n001-n004 --domainname dulak-cluster.fysik.dtu.dk --script n001.dulak-cluster

    Answer (necessary in version "3.8.2"):

    I will work with hostnames:  n001-n004
           in the domain:  dulak-cluster.fysik.dtu.dk
    
    Are you satisfied? (y/[n]): y
    
    can be autoinstalled with one of the available images? ([y]/n): y
    
    I will ask you for your clients' IP addresses one subnet at a time.
    Would you like me to continue? (y/[n]): n
  • start rsync daemon:

    /etc/init.d/systemimager-server-rsyncd restart
  • unfortunately (as of 14 Apr 2009) you must make sure that SELinux is in permissive mode: edit /etc/selinux/config to contain:

    SELINUX=permissive
    SELINUXTYPE=targeted
  • if the system is not yet in permissive mode, switch it now and force a filesystem relabel after the next reboot (see http://www.crypt.gen.nz/selinux/disable_selinux.html; note that you can switch back to enforcing mode with echo 1 >/selinux/enforce):

    echo 0 >/selinux/enforce
    touch /.autorelabel
  • create the /var/lib/systemimager/scripts/post-install/90all.autorelabel script (it makes a freshly cloned node relabel its filesystem with the correct SELinux contexts on its next boot):

    #!/bin/sh
    #
    # To be used with the pxeconfig tool.
    # Description: SElinux autorelabel.
    #
    
    # Load installation variables.
    [ -e /tmp/post-install/variables.txt ] && . /tmp/post-install/variables.txt
    
    if [ -f /etc/SuSE-release ]; then
        # SuSE
        echo "Not implemented yet"
    elif [ -f /etc/redhat-release ]; then
        # RedHat / Fedora
        touch /.autorelabel
    elif [ -f /etc/debian_version ]; then
        # Debian
        echo "Not implemented yet"
    fi
  • make the script executable:

    chmod go+x /var/lib/systemimager/scripts/post-install/90all.autorelabel
  • create the /var/lib/systemimager/scripts/post-install/30all.pxeconfig script (it removes the <hex_ipaddr> file from the /tftpboot/pxelinux.cfg directory so that the client will boot from disk):

    #!/bin/sh
    #
    # To be used with the pxeconfig tool.
    # Remove the <hex_ipaddr> file from the pxelinux.cfg directory so the client will boot from disk.
    
    # Load installation variables.
    [ -e /tmp/post-install/variables.txt ] && . /tmp/post-install/variables.txt
    telnet $IMAGESERVER 6611
    sleep 1
    exit 0
  • reboot:

    shutdown -r now&
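
The <hex_ipaddr> file mentioned in the 30all.pxeconfig step follows the PXELINUX convention of naming a per-client configuration file after the client's IPv4 address written as eight uppercase hexadecimal digits. The sketch below computes that name; ip_to_hex is a hypothetical helper, and the address in the usage note is taken from the DHCP range above purely as an illustration.

```shell
# Hypothetical helper: print an IPv4 address as eight uppercase hex digits,
# the file name PXELINUX looks for under pxelinux.cfg/.
ip_to_hex() {
    old_ifs=$IFS
    IFS=.
    set -- $1        # split the dotted quad into $1..$4
    IFS=$old_ifs
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}
```

For example, ip_to_hex 10.3.0.100 prints 0A030064, so a client with that address would look for /tftpboot/pxelinux.cfg/0A030064 before falling back to default.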

Installing a fresh node

Note: do not install more than one node now, just one. This will allow you to test parallel jobs! Additional settings will be applied to the nodes during benchmarking and maintenance.

On "dulak-server", configure node n002 to be installed using the default.node_install.n001.dulak-cluster configuration:

pxeconfig n002 -f default.node_install.n001.dulak-cluster

and get the current "Golden Client" image (you must run this command after every change on the "Golden Client"):

getn001 # answer y 4 times

On the n002 node to be cloned using default.node_install.n001.dulak-cluster:

  • In the BIOS, set the system to boot from the hard disk first. When the node to be cloned starts, manually choose to boot from the network. Systemimager should install the node using the "Golden Client" image, and afterwards the node should boot from the hard disk.

  • If you did not create the /var/lib/systemimager/scripts/post-install/90all.autorelabel script on dulak-server, then when you log in to a freshly cloned node as root in graphical mode you may get:

    session_child_run:: could not exec /etc/X11/xinit/session default
  • in that case, log in in text mode and run (this is what /var/lib/systemimager/scripts/post-install/90all.autorelabel does):

    touch /.autorelabel; reboot
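
To avoid discovering a missing post-install script only after a node has been cloned, the two scripts described earlier can be checked on dulak-server beforehand. This is a sketch only: check_post_install is a hypothetical helper, and treating a non-executable script as an error is an assumption made here, not something the systemimager documentation is quoted on.

```shell
# Hypothetical helper: verify that both post-install scripts exist and are executable.
# Usage: check_post_install [DIR]
check_post_install() {
    dir=${1:-/var/lib/systemimager/scripts/post-install}
    status=0
    for script in 30all.pxeconfig 90all.autorelabel; do
        if [ ! -f "$dir/$script" ]; then
            echo "$script: missing"
            status=1
        elif [ ! -x "$dir/$script" ]; then
            echo "$script: not executable"
            status=1
        fi
    done
    return "$status"
}
```

Run without arguments it inspects the default post-install directory; it prints nothing and returns 0 when both scripts are in place.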

Congratulations: your setup is ready for testing - go to benchmarking and maintenance.

Troubleshooting

The information below was gathered during attempts to keep SELinux enforcing, and is only relevant if you want to experiment with systemimager with SELinux enabled. You may encounter the following errors during the installation of the cloned node:

  • if you get:

    Oct  9 18:50:31 dulak-server rsyncd[3462]: rsync: failed to open log-file /var/log/systemimager/rsyncd: Permission denied (13)

    Do:

    Unknown
  • if you get:

    Oct  9 19:09:36 dulak-server setroubleshoot:      SELinux is preventing /usr/bin/rsync (rsync_t) "create" access to <Unknown> (rsync_t).      For complete SELinux messages. run sealert -l 2e645de0-d434-49ca-91ff-6395d6fea367
    Oct  9 19:09:36 dulak-server setroubleshoot:      SELinux is preventing /usr/bin/rsync (rsync_t) "create" access to <Unknown> (rsync_t).      For complete SELinux messages. run sealert -l 2e645de0-d434-49ca-91ff-6395d6fea367

    Do:

    Unknown
  • if you get:

    Oct  9 17:09:34 dulak-server rsyncd[3580]: rsync: link_stat "/i386/standard/boel_binaries.tar.gz" (in boot) failed: Permission denied (13)
    Oct  9 17:09:34 dulak-server rsyncd[3580]: rsync error: some files could not be transferred (code 23) at main.c(615) [sender=2.6.8]
    Oct  9 19:09:36 dulak-server setroubleshoot:      SELinux is preventing rsync (/usr/bin/rsync) "getattr" to /i386/standard/boel_binaries.tar.gz (usr_t).      For complete SELinux messages. run sealert -l c600bb88-a1d5-46ec-aed6-b039e797a5bf

    Do:

    chcon -t rsync_data_t /usr/share/systemimager/boot/i386/standard/boel_binaries.tar.gz
  • if you get:

    Oct 10 11:24:51 dulak-server rsyncd[10421]: rsync: chroot /var/lib/systemimager/scripts failed: Permission denied (13)
    Oct 10 11:24:53 dulak-server setroubleshoot:      SELinux is preventing rsync (/usr/bin/rsync) "search" to lib (var_lib_t).      For complete SELinux messages. run sealert -l 97f69c1b-87ac-42d0-a066-131f2ec11556

    Do:

    chcon -t rsync_data_t /var/lib
    chcon -R -h -t rsync_data_t /var/lib/systemimager
  • if you get (lots of similar):

    Oct 10 12:43:27 dulak-server rsyncd[10635]: rsync: opendir "/var/lib/dav" (in n001.dulak-cluster) failed: Permission denied (13)

    Do:

    Unknown
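
Each setroubleshoot message above ends with a "sealert -l <id>" hint, and the same denial is often logged more than once. The sketch below collects the distinct IDs from log text so that each can be inspected once; sealert_ids is a hypothetical helper, and reading the log from /var/log/messages in the usage note is an assumption about where syslog writes on this system.

```shell
# Hypothetical helper: print each distinct sealert ID found in log text on stdin.
sealert_ids() {
    # Keep only the ID that follows "sealert -l", then drop duplicates.
    sed -n 's/.*sealert -l \([0-9a-f-]*\).*/\1/p' | sort -u
}
```

For example, sealert_ids < /var/log/messages lists the IDs, and each one can then be passed to sealert -l to see the complete SELinux message.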

Niflheim: Building_a_Cluster_-_Tutorial/installing_and_configuring_systemimager (last edited 2010-11-04 12:58:53 by OleHolmNielsen)