Linux System Installation via PXE


Scope of this Document

This document covers the components and process of building a Linux system via network installation. It assumes the reader already understands the functional concepts of bind, dhcpd, tftpd, httpd, and NFS.

This document applies to all servers listed under Service Providers.

Critical Components

On the PXE Service Provider Server:

System BIOS Support for PXE Boot
DHCP Server
TFTP Server
NFS Export
HTTP Server
Anaconda Configuration File
DNS Entries

Network Considerations:

VLAN dhcp-helper address
Switchport VLAN Configuration

Service Providers

These are the servers that currently provide linux installation services:

Current:

es-nws01.emcent..tech.net
ats-nws01.atlanta..tech.net
psl-labnws01.lab.tech.net

Retiring:

es-nws01.emcent..tech.net

System BIOS Support for PXE Boot

On a Dell server, enter the BIOS by pressing the F2 key during POST. F2 can be pressed at any time before hand-off to the boot process, even if the F-key options have scrolled off the screen. Once in the BIOS, select Integrated Devices and ensure that 'Enabled with PXE' is selected for interface 1. This is not needed for interface 2 and beyond; those only need to be set to Enabled. Save and exit.

DHCP Server

The DHCP server has some statements that cause the behavior we need when the PXE process starts.

allow booting;
allow bootp;
class "pxeclients" {
        match if substring(option vendor-class-identifier, 0, 9) = "PXEClient";
        next-server 10.6.246.21;
        filename "linux-install/pxelinux.0";
}

next-server is the TFTP server that serves the bootstrapper (the file named in the 'filename' statement) to begin the process. This bootstrapper will then load the specified Linux distribution, or it will pause and show the list of available operating systems.

Each VLAN must have a separate subnet statement in the dhcpd.conf file.

subnet 10.6.246.0 netmask 255.255.255.0 {
        # Management Systems VLAN - Generally for IT 
        option routers                  10.6.246.1;
        option subnet-mask              255.255.255.0;
        option domain-name              "lab.tech.net";
        option domain-name-servers      10.6.246.21;
        default-lease-time 21600;
        max-lease-time 43200;

        host psl-nws-legacy {
                hardware ethernet       00:50:56:92:67:a2;
                fixed-address           10.6.246.4;
        }
}

In the above example, the subnet is defined, and then options are specified. The 'option domain-name' line sets the search domain on the PXE client. This allows the client to find the installation host by shortname (which is named in the .cfg file later) and simplifies the overall process. The 'option domain-name-servers' line specifies a DNS server. It can be any DNS server that is able to resolve internal hostnames.

Then there are the individual host statements. The 'host' line carries the common shortname of the system. This entry is purely cosmetic and does not have to be correct, but it *does* have to be unique among all host names in the dhcpd.conf file. 'hardware ethernet' is the MAC address. It can be taken from the BIOS Integrated Devices page, from ifconfig on a live system, or from 'edit options' of a virtual machine. Specify the selected IP address for the system in 'fixed-address'. This is generally the permanent IP that has been selected for the system to operate on, but not always: occasionally a system is built on one VLAN and then moved to another.
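
The host statement format and the uniqueness rule described above can be sketched as a pair of small shell helpers. This is only an illustration: the function names dhcp_host_stanza and host_name_in_use are hypothetical, and the name and MAC pairing in the example call is made up from values that appear elsewhere on this page.

```shell
# Print a dhcpd.conf host stanza for one PXE client.
# Arguments: shortname, MAC address, fixed IP address.
dhcp_host_stanza() {
    printf '        host %s {\n' "$1"
    printf '                hardware ethernet       %s;\n' "$2"
    printf '                fixed-address           %s;\n' "$3"
    printf '        }\n'
}

# Return success if a host with this name is already declared in the
# given dhcpd.conf (the name must be unique within the file).
# Arguments: path to dhcpd.conf, shortname.
host_name_in_use() {
    grep -Eq "^[[:space:]]*host[[:space:]]+$2[[:space:]]*\{" "$1"
}

# Illustrative call; the name/MAC/IP combination is an assumption.
dhcp_host_stanza psl-qaclmd01 18:03:73:0b:c6:95 10.6.244.91
```

The printed stanza still has to be pasted inside the correct subnet block by hand. Running host_name_in_use against /etc/dhcpd.conf first is a cheap way to catch a duplicate name before dhcpd refuses the file.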

TFTP Server

The TFTP server does not take much to get operational. Install the tftp-server package on the system, and then set:

chkconfig tftp on

Everything for tftp-server is stored in /tftpboot/linux-install by default. For example, you might see this:

drwxr-xr-x 2 root root  4096 Jul  7  2011 msgs
-rw-r--r-- 1 root root 13100 Dec 19  2005 pxelinux.0
drwxr-xr-x 2 root root  4096 Jul 30 16:03 pxelinux.cfg
drwxr-xr-x 2 root root  4096 Feb  7  2012 RHEL5-U3-x86_64

The pxelinux.0 file is the bootstrapper that is fed via TFTP to the PXE client. The pxelinux.cfg directory contains the individual PXE profiles, one for each configured target IP address (named in hex). Finally, the other directories represent the operating systems that are available to be installed via network (these are addressed later, after the repositories are set up).
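
To find the profile that belongs to a given client in pxelinux.cfg, it helps to convert the dotted IP address to its hex form. A minimal sketch, assuming the profile file is named with the full 8-digit uppercase hex address (the helper name ip_to_hex is hypothetical):

```shell
# Convert a dotted-quad IPv4 address to 8 uppercase hex digits.
# The function body runs in a subshell so the IFS change does not leak.
ip_to_hex() (
    IFS=.
    set -- $1    # split the address into four octets
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
)

ip_to_hex 10.6.244.91    # prints 0A06F45B
```

With the default layout described above, the profile for 10.6.244.91 would then be expected at /tftpboot/linux-install/pxelinux.cfg/0A06F45B.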

NFS Export

We currently use an NFS export to serve the operating system files. NFS is not the only possible method, but we already use NFS for many other administrative tasks, so it is convenient for our purposes.

The file system structure of the repository server is similar to this example. The repository below is located at:

/opt/data/RedHat/RHEL6/

Inside the above directory is the below structure.

|-- RHEL6
|   |-- i386
|   |   |-- iso
|   |   `-- os
|   `-- x86_64
|       |-- iso
|       `-- os
|-- RHEL6-U1
|   |-- i386
|   |   |-- iso
|   |   `-- os
|   `-- x86_64
|       |-- iso
|       `-- os
|-- RHEL6-U2
|   |-- i386
|   |   |-- iso
|   |   `-- os
|   `-- x86_64
|       |-- iso
|       `-- os
`-- RHEL6-U3
    |-- i386
    `-- x86_64
        |-- iso
        |-- os
        `-- updates

Notes:
1) i386 directories are only maintained once a project requests an i386 install. Otherwise all systems are installed with a 64-bit operating system.
2) The updates tree is only maintained against the most recent release. These are individual updates that are collected and deployed. It is not an rsync against a public repo on the internet, as that would carry a significant storage cost.
3) There is nothing special about the x86_64/os directory. It is just a copy of the contents of the .iso file, which already contains the yum repository configuration.

Once the files have been copied the NFS export must be defined. Edit the file /etc/exports with a line like this:

/opt/data   *(rw,no_root_squash)

Set NFS to autostart when the system boots:

chkconfig nfs on

Once the export is defined and NFS is running, it is time to create the PXE operating system and client IP address profiles. The necessary commands are part of the package 'system-config-netboot-cmd'. To create an OS profile, a command such as this is used:

pxeos -a -p NFS -D 0 -s 10.6.246.21 -L /opt/data/RedHat/RHEL6/RHEL6-U3/x86_64/os RHEL6-U3-x86_64

We are specifying:

-a add
-p protocol we are using
-D diskless, yes or no (we specify 0, which is no)
-s server we download from via our specified protocol
-L location on the server
name of the profile. This is free-form and can say anything you like, but it is the name that pxeboot -O refers to later.

To check that the OS profile was successfully created, or to see existing OS profiles, use the pxeos command with -l:

[root@psl-labnws01 RHEL6]# pxeos -l
RHEL6-U3-x86_64
        Description:
        Protocol:       NFS
        isDiskless:     False
        Server:         10.6.246.21
        Location:       /opt/data/RedHat/RHEL6/RHEL6-U3/x86_64/os

NFS is critical to the above, so there will be an error on submission if the NFS export and file path are not correct.
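
Before running pxeos, it can be worth confirming that the location passed to -L really sits under an exported directory. The following is a rough sketch only: the helper name path_is_exported is hypothetical, and it checks just the directory column of the exports file, not the client list or options.

```shell
# Return success if the given path is at or below a directory listed
# in the exports file. Arguments: path, exports file (default /etc/exports).
path_is_exported() {
    path=$1
    exports_file=${2:-/etc/exports}
    while read -r export_dir _rest; do
        # Skip blank lines and comments.
        case $export_dir in
            ''|'#'*) continue ;;
        esac
        # Compare with trailing slashes so /opt/data does not match /opt/database.
        case $path/ in
            "${export_dir%/}"/*) return 0 ;;
        esac
    done < "$exports_file"
    return 1
}
```

For example, 'path_is_exported /opt/data/RedHat/RHEL6/RHEL6-U3/x86_64/os' should succeed on a server whose /etc/exports contains the /opt/data line shown earlier. It does not prove that the NFS service is actually running.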

To create a profile for a target IP address, use this command:

pxeboot -a -O RHEL6-U3-x86_64 -r 30512 -K http://psl-labnws/psl-qaclmd01.cfg 10.6.244.91

We are specifying:

-a add
-O the selected profile name, from above (see pxeos -l)
-r amount of ram to use for the anaconda environment during boot
-K http location of kickstart file
ip address of target system

Notes:
1) The bootstrapper selected with the -O switch must match the operating system configured inside the .cfg file that is downloaded from the -K location. If there is a mismatch, installation will fail.
2) The http location can use a shortname only as long as the DHCP server is configured with 'option domain-name' in the segment for the target's VLAN. Otherwise, the FQDN or IP address of the http server may be substituted.
3) For sanity, the name of the .cfg file should be a representation of the target system's hostname. Note, that the hostname configured inside the .cfg does not have to be the same as the name of the .cfg file.
4) The configured IP address in the example (10.6.244.91) will always download RHEL6-U3-x86_64 bootstrapper and psl-qaclmd01.cfg, each time it boots with "boot to PXE" enabled.

Things to *ALWAYS* consider:
1) The DHCP statement for a server causes the "boot to PXE" functionality to always be available at boot. If the DHCP statement is erased, F12 at boot will not successfully start an operating system install.
2) The server is looking for a PXE client on the configured IP address. If you build server 1, shut it down, and then reconfigure the DHCP statement with a different MAC address for another server, that other server will boot up and behave as the first, taking the IP and pulling down the install. There is no logging facility that tracks which MAC most recently used which IP. If the IP comes up and asks for PXE, and the IP is known and configured, it gets PXE.

THIS IS THE MOST CRITICAL TO CONSIDER:
3) If a server is up, configured, and working perfectly, and you reboot it and hit F12 while the above items are configured, a PXE install will happen. There is no "yes/no/really/are you sure?" prompt. PXE will completely erase and reinstall a system if its IP address is known to the PXE configuration.
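
Because any IP address with a PXE profile can be reinstalled without confirmation, it is useful to see which addresses are currently "armed". A minimal sketch, assuming profiles are stored as 8-digit uppercase hex file names under pxelinux.cfg as described in the TFTP section (the helper names hex_to_ip and list_pxe_armed_ips are hypothetical; files with any other name, such as a default profile, are ignored):

```shell
# Convert 8 hex digits back to a dotted-quad IPv4 address.
hex_to_ip() {
    printf '%d.%d.%d.%d\n' \
        "0x$(printf '%s' "$1" | cut -c1-2)" \
        "0x$(printf '%s' "$1" | cut -c3-4)" \
        "0x$(printf '%s' "$1" | cut -c5-6)" \
        "0x$(printf '%s' "$1" | cut -c7-8)"
}

# Print the IP address of every per-host profile in the directory.
# Argument: profile directory (default /tftpboot/linux-install/pxelinux.cfg).
list_pxe_armed_ips() {
    cfg_dir=${1:-/tftpboot/linux-install/pxelinux.cfg}
    for f in "$cfg_dir"/*; do
        name=${f##*/}
        case $name in
            [0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F])
                hex_to_ip "$name" ;;
        esac
    done
}
```

Any production system whose address shows up in this list is one F12 away from being wiped, so the list is worth reviewing after each build.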

HTTP Server

A web server is used to feed the anaconda kickstart files to the PXE client. Currently they are served via Apache, but they can be on any web server (IIS, lighttpd, anything), as long as the file can be reached. It can also be any server. Nothing decrees that you must use your PXE server, but we do so for administrative convenience.

Configuration files are stored in /var/www/html.

Anaconda Configuration File

The .cfg files contain everything the installer needs to build the system. The syntax can vary from RHEL5 to RHEL6 systems. RHEL6 syntax can also vary between RHEL6 and RHEL6-U1 and above, and can depend on the target hardware it is being installed on (specifically, the type of network interface and its name designation).

Typical RHEL5 Config File
Typical RHEL6 Config File

DNS Entries

DNS entries for each PXE Service Provider are required in order to use shortnames in the configuration files. The configuration file will refer to both HTTP and NFS destinations. If DNS entries do not exist, then IP addresses and FQDNs may be substituted, but for consistency's sake DNS entries are always maintained for all PXE servers. Properly configured DNS entries should return values such as these:

[jhorne@pd-2njg6j1FC17 ~]$ host es-nws01
es-nws01.emcent..tech.net has address 10.88.254.22
[jhorne@pd-2njg6j1FC17 ~]$ host 10.88.254.22
22.254.88.10.in-addr.arpa domain name pointer es-nws01.emcent..tech.net.

If similar replies are not received from the DNS infrastructure, make the necessary changes, because this will simplify all operations (not just PXE).
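
The forward and reverse check above can be scripted. This is a sketch under assumptions: the helper names reverse_name and dns_round_trip are hypothetical, and dns_round_trip relies on the output format of the 'host' command shown in the example ("has address" and "domain name pointer").

```shell
# Build the in-addr.arpa name that a PTR lookup for this IPv4 address uses.
# The function body runs in a subshell so the IFS change does not leak.
reverse_name() (
    IFS=.
    set -- $1    # split the address into four octets
    printf '%s.%s.%s.%s.in-addr.arpa\n' "$4" "$3" "$2" "$1"
)

# Resolve a name to an address, resolve that address back, and print both.
# Returns non-zero if either lookup yields nothing.
dns_round_trip() {
    addr=$(host "$1" | awk '/ has address / { print $NF; exit }')
    [ -n "$addr" ] || { echo "no A record for $1" >&2; return 1; }
    ptr=$(host "$addr" | awk '/ domain name pointer / { print $NF; exit }')
    [ -n "$ptr" ] || { echo "no PTR record for $addr" >&2; return 1; }
    printf '%s -> %s -> %s\n' "$1" "$addr" "$ptr"
}

reverse_name 10.88.254.22    # prints 22.254.88.10.in-addr.arpa
```

Running 'dns_round_trip es-nws01' from a client in the right search domain should print the shortname, the address, and the FQDN that the PTR record points back to.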

VLAN dhcp-helper address

Each VLAN must have a DHCP helper address configured, or else DHCP requests will never make it to the DHCP server. This is configured by the network team. If DHCP requests are expected but are not reaching the server, and the DHCP service itself has been verified operational, verify the DHCP helper on the VLAN. Below is an example configuration:

interface Vlan500
 ip address 10.50.0.1 255.255.255.0
 ip helper-address 10.6.246.21
!

Switchport VLAN Configuration

Each switch port that a server plugs into must be configured for the correct target VLAN and must have the feature 'spanning-tree portfast' enabled. Without this feature, the PXE boot process will not work because the port does not come up quickly enough: by the time the switch brings the port up, the PXE process has already timed out and failed. Spanning-tree portfast brings the port up quickly and skips the checks the switch would normally do. It should only be used on a port that a server is plugged into, not on a port that connects to another switch (which is what the switch would normally be checking for).

interface GigabitEthernet6/40
 switchport
 switchport access vlan 244
 switchport mode access
 spanning-tree portfast
!

In the above example, the vlan configured is 244. When the PXE client boots and requests DHCP, this will be logged:

Aug 20 12:01:14 psl-labnws01 dhcpd: DHCPDISCOVER from 18:03:73:0b:c6:95 via 10.6.244.1: network 10.6.244/23: no free leases

Since the port is configured for VLAN 244, the request reaches the DHCP server via the 10.6.244.1 gateway. If the host is configured in the 10.6.246.0/24 segment of the /etc/dhcpd.conf file, we would see the above message: the client is unknown on that network, because the switch is putting the system on VLAN 244 and we want VLAN 246. If the client is known in /etc/dhcpd.conf, but pxeos and the .cfg file have the wrong VLAN, the PXE install will bootstrap but will be unable to download the .cfg file. The VLAN configured on the individual switch port must match the intended network specified in the PXE, DHCP, and .cfg configuration. These are three independent tools, but their configurations must be synchronized.