Using Multiple Ethernet Cards

Linux port bonding

Some machines, especially servers, are equipped with dual Ethernet ports on the motherboard. In order to use both ports for increased bandwidth and/or redundancy, Linux must be configured appropriately.

You should consult this very nice overview of the Linux bonding driver and the Linux Ethernet Bonding Driver HOWTO. The kernel-doc RPM also documents port bonding in the file /usr/share/doc/kernel-doc-*/Documentation/networking/bonding.txt or in http://www.kernel.org/doc/Documentation/networking/bonding.txt.

For CentOS5 Linux this is documented in 14.2.3 Channel_Bonding_Interfaces.

Loading the bonding kernel module

Read the Channel_Bonding_Interfaces manual and bonding_Module_Directives for the parameter values. According to the Red Hat documentation, the preferred place for bonding parameters is the BONDING_OPTS directive in the file /etc/sysconfig/network-scripts/ifcfg-bond0.

For RHEL6 read Using Channel Bonding.
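
As a minimal sketch of the ifcfg-bond0 approach mentioned above, the bonding parameters could be carried in the interface file itself. This assumes that the installed network scripts support the BONDING_OPTS directive; the parameter values are simply the ones used elsewhere on this page:

```shell
# Sketch of /etc/sysconfig/network-scripts/ifcfg-bond0 with the
# bonding module options given inline.
# Assumption: the installed initscripts honour BONDING_OPTS.
DEVICE=bond0
BOOTPROTO=dhcp
ONBOOT=yes
USERCTL=no
BONDING_OPTS="mode=6 miimon=100 updelay=200"
```

With this variant the options line in the modprobe configuration should not be needed, but whether the alias line can also be dropped depends on the release, so check the manuals cited above.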

Our current instructions are: Add these lines to /etc/modprobe.conf (not /etc/modules.conf):

alias bond0 bonding
options bond0 mode=6 miimon=100 updelay=200

The mode=6 refers to:

Sets an Adaptive Load Balancing (ALB, balance-alb) policy for fault tolerance and load balancing.
Includes transmit and receive load balancing for IPv4 traffic.
Receive load balancing is achieved through ARP negotiation.

The miimon=100 refers to:

Specifies the MII link monitoring frequency in milliseconds.
This determines how often the link state of each slave is
inspected for link failures.  A value of zero disables MII
link monitoring.  A value of 100 is a good starting point.
The use_carrier option, below, affects how the link state is
determined.  See the High Availability section for additional
information.  The default value is 0.

The updelay=200 refers to:

Specifies the time, in milliseconds, to wait before enabling a
slave after a link recovery has been detected.  This option is
only valid for the miimon link monitor.  The updelay value
should be a multiple of the miimon value; if not, it will be
rounded down to the nearest multiple.  The default value is 0.
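
The rounding rule quoted above can be checked with a little shell arithmetic. This is only an illustration of the quoted rule, not code from the bonding driver; the function name effective_updelay is made up for this example, and printing 0 when miimon is 0 is our assumption (updelay is only valid together with miimon):

```shell
#!/bin/sh
# effective_updelay MIIMON UPDELAY   (hypothetical helper, values in ms)
# Prints UPDELAY rounded down to the nearest multiple of MIIMON.
effective_updelay() {
    miimon=$1
    updelay=$2
    if [ "$miimon" -le 0 ]; then
        # Assumption: without MII monitoring the updelay has no effect.
        echo 0
        return
    fi
    # Integer division discards the remainder, i.e. rounds down.
    echo $(( (updelay / miimon) * miimon ))
}

effective_updelay 100 200   # prints 200 (already a multiple)
effective_updelay 100 250   # prints 200 (rounded down)
```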

If you do not set the updelay parameter, the syslog may show this warning:

kernel: bonding: In ALB mode you might experience client disconnections upon reconnection of a link if the bonding module updelay parameter (0 msec) is incompatible with the forwarding delay time of the switch

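We have seen a few cases where the network becomes unreliable without the updelay parameter. The switch forward delay is related to the Spanning Tree Protocol (if it is configured); see Spanning Tree Protocol Timers (http://www.cisco.com/en/US/tech/tk389/tk621/technologies_tech_note09186a0080094954.shtml):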

forward delay: The forward delay is the time that is spent in the listening and learning state.
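
If the switch delay is known, a candidate updelay can be derived from it. The sketch below assumes that the goal is an updelay no smaller than the switch delay, rounded up to a multiple of miimon so that the driver does not round it down again. The function name updelay_for_switch is hypothetical, and the 15 second figure is the usual Spanning Tree default forward delay, not a value measured on our switches:

```shell
#!/bin/sh
# updelay_for_switch DELAY_SECONDS MIIMON_MS   (hypothetical helper)
# Prints an updelay in ms that is >= the switch delay and is a
# multiple of miimon.
updelay_for_switch() {
    delay_s=$1
    miimon=$2
    need_ms=$(( delay_s * 1000 ))
    # Round up to the next multiple of miimon.
    echo $(( (need_ms + miimon - 1) / miimon * miimon ))
}

updelay_for_switch 15 100   # prints 15000
```

To cover both the listening and the learning state, pass the sum of the two periods (30 seconds with the default timers) as the first argument. Whether such a long updelay is acceptable is a site decision.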

Modifying network scripts

In /etc/sysconfig/network-scripts/ new script files should be created:

  1. Create a new bonding device script file ifcfg-bond0 containing:

    DEVICE=bond0
    BOOTPROTO=dhcp
    ONBOOT=yes
    USERCTL=no

  2. The normal Ethernet interface scripts ifcfg-ethN should turn eth0 and eth1 into slave devices:

    DEVICE=eth0
    ONBOOT=yes
    BOOTPROTO=dhcp
    MASTER=bond0
    SLAVE=yes
    USERCTL=no

    and similarly for eth1.
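
Since the slave files differ only in the device name, step 2 can be sketched as a small loop. The helper name write_slave_cfg is made up for this example, and the files are written to a scratch directory rather than to /etc/sysconfig/network-scripts, so that the sketch can be tried without touching a live configuration:

```shell
#!/bin/sh
# write_slave_cfg DIR IFACE [MASTER]   (hypothetical helper)
# Writes DIR/ifcfg-IFACE with the slave settings shown in step 2.
write_slave_cfg() {
    dir=$1
    iface=$2
    master=${3:-bond0}
    cat <<EOF > "$dir/ifcfg-$iface"
DEVICE=$iface
ONBOOT=yes
BOOTPROTO=dhcp
MASTER=$master
SLAVE=yes
USERCTL=no
EOF
}

# Example run into a scratch directory (not the real config directory).
outdir=$(mktemp -d)
for iface in eth0 eth1; do
    write_slave_cfg "$outdir" "$iface"
done
cat "$outdir/ifcfg-eth1"
```

After inspecting the result, the files would be copied into /etc/sysconfig/network-scripts/ by hand.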

When using SystemImager to clone the nodes, these steps can be performed automatically by post-install scripts. For example, the script /var/lib/systemimager/scripts/post-install/20q.eth_bonding_config performs step 2:

#!/bin/sh

# Get the Systemimager variables
. /tmp/post-install/variables.txt

# Name of the central server on this network
SERVER=audhumbla1
DOMAINNAME=dcsc.fysik.dtu.dk

# Correct the SystemImager eth0 config, turning eth0 into an Ethernet bonding device (bond0=eth0+eth1)
cp -p /etc/sysconfig/network-scripts/ifcfg-eth0 /tmp/ifcfg-eth0.BAK
cat <<EOF > /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=dhcp
MASTER=bond0
SLAVE=yes
USERCTL=no
EOF

# Finished
cd

Restart network services

At this stage, restart the network with service network restart, or reboot the system, in order to activate the bond0 device instead of the normal eth0 device.
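
After the restart, the bonding driver reports its state in /proc/net/bonding/bond0. The following sketch summarizes the link state per slave; the function name bond_slave_status is made up here, and the parsing assumes the usual "Slave Interface:" and "MII Status:" lines in that file:

```shell
#!/bin/sh
# bond_slave_status FILE   (hypothetical helper)
# Prints one line "<slave> <MII status>" for each slave listed in FILE.
bond_slave_status() {
    awk -F': ' '
        /^Slave Interface:/ { slave = $2 }
        # The bond itself also has an MII Status line; it is skipped
        # because no slave name has been seen at that point.
        /^MII Status:/ && slave != "" { print slave, $2; slave = "" }
    ' "$1"
}

# Only query the live file when the bond actually exists.
if [ -r /proc/net/bonding/bond0 ]; then
    bond_slave_status /proc/net/bonding/bond0
fi
```

Both eth0 and eth1 should be listed with status up; a missing slave suggests a problem in the corresponding ifcfg-ethN file.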

Port bonding troubleshooting

No DHCP response for the bond0 device

If you have set up the bond0 device for DHCP with BOOTPROTO=dhcp and you do not get a DHCP response from the server, the reason may be that bond0 sends its DHCP request with the MAC address of the first Ethernet device (usually eth0). If your DHCP server is configured with the Ethernet MAC address of another device (for example, eth1), then DHCP will fail.

This scenario happens when the Linux kernel names the Ethernet devices eth0 and eth1 in the opposite order to the numbering of the hardware ports. Check this with:

ifconfig -a

to see the MAC-addresses of the network interfaces.
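
The same information can be read from sysfs in a form that is easy to compare with the DHCP server configuration. This is a sketch only: the function name list_macs is made up, and it assumes that each interface exposes its MAC address in /sys/class/net/<interface>/address:

```shell
#!/bin/sh
# list_macs [SYSNET_DIR]   (hypothetical helper)
# Prints "<interface> <MAC address>" for every interface found.
list_macs() {
    sysnet=${1:-/sys/class/net}
    for dev in "$sysnet"/*; do
        # Skip entries without a readable address file.
        [ -r "$dev/address" ] || continue
        printf '%s %s\n' "$(basename "$dev")" "$(cat "$dev/address")"
    done
}

list_macs
```

Run this before the bond is active, since enslaved devices may report the MAC address of the bond; the original addresses are then listed as "Permanent HW addr" in /proc/net/bonding/bond0.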

SystemImager can correct this problem by explicit naming of network interfaces, as described in the Troubleshooting section "A possible solution to fix network interface naming".

You can find the PCI device names and their MAC addresses with, for example (on newer systems the equivalent command is udevadm info -a -p /sys/class/net/eth0):

udevinfo -a -p /sys/class/net/eth0

and then add appropriate configuration lines to the file /etc/udev/rules.d/60-net.rules.
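
As a sketch of how such configuration lines could be produced, the PCI address can also be taken from the device symlink in sysfs and formatted as a rule. The function names pci_addr and net_rule are made up for this example; it assumes that /sys/class/net/<interface>/device is a symlink whose last path component is the PCI address, and it prints rules in the same syntax as our 60-net.rules entries:

```shell
#!/bin/sh
# pci_addr IFACE [SYSNET_DIR]   (hypothetical helper)
# Prints the PCI address behind IFACE, e.g. 0000:05:00.1.
pci_addr() {
    iface=$1
    sysnet=${2:-/sys/class/net}
    # Virtual devices have no device symlink; report failure for them.
    link=$(readlink "$sysnet/$iface/device") || return 1
    basename "$link"
}

# net_rule PCI_ID NAME   (hypothetical helper)
# Prints one udev rule line binding PCI_ID to the interface NAME.
net_rule() {
    printf 'ACTION=="add", SUBSYSTEM=="net", BUS=="pci", ID=="%s", NAME="%s"\n' "$1" "$2"
}

# Example with the address used on our SL2x170zG6 nodes:
net_rule 0000:05:00.1 eth0
```

The output is only printed; review the lines before adding them to the rules file.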

To implement this we have made a SystemImager post-install script for the SL2x170zG6 nodes in /var/lib/systemimager/scripts/post-install/15d.eth_device_names with the essential content:

DEVICE_RULES=/etc/udev/rules.d/60-net.rules
TEMPNAME=/tmp/eth_names
# Create PCI device name to ethX names for HP SL2x170zG6:
cat <<EOF > $TEMPNAME
ACTION=="add", SUBSYSTEM=="net", BUS=="pci", ID=="0000:05:00.1", NAME="eth0"
ACTION=="add", SUBSYSTEM=="net", BUS=="pci", ID=="0000:05:00.0", NAME="eth1"
EOF
# Append original device rules
cat $DEVICE_RULES >> $TEMPNAME
# Write new device rules file (with backup)
cp -p $DEVICE_RULES $DEVICE_RULES.orig
cp $TEMPNAME $DEVICE_RULES

The PCI device addresses 0000:05:00.x will vary depending on the hardware.

Niflheim: MultipleEthernetCards (last edited 2017-01-06 09:15:53 by OleHolmNielsen)