Recompiling kernel on VanGogh cluster, and patching with web100 modification
6 May 2003, Freek Dijkstra
The installation is done in three parts:
- Installing a new clean 2.4.26 kernel on vangogh0
- Patching the kernel and installing it on vangogh0
- Copying the kernel to the other nodes
A: Installing a new clean 2.x kernel on vangogh0
(Step 1) GRAB SOURCE
This step compiles a kernel from scratch, using external sources. For Red Hat or Debian, you may prefer to get the sources from a specific rpm or deb package. See earlier howtos for that information.
(Variant A) Download fresh sources
# wget http://www.kernel.org/pub/linux/kernel/v2.4/linux-2.4.26.tar.bz2
# bunzip2 linux-2.4.26.tar.bz2
# tar xf linux-2.4.26.tar
(Variant B) Use Red Hat:
On Red Hat, we need the kernel-source package:
$ rpm -q kernel-source
package kernel-source is not installed
[Insert RedHat 7.3, CD 2]
# mount /dev/cdrom /mnt/cdrom
$ uname -r
2.4.18-3
[let's use the same kernel version]
# rpm -Uvh /mnt/cdrom/RedHat/RPMS/kernel-source-2.4.18-3.i386.rpm
[note: recommended package kernel-headers not found]
(Variant C) Use Debian:
You may want to consider installing a precompiled kernel, instead of compiling it yourself, in particular if you are not going to make any patch.
This variant has not been written yet. Go to http://packages.debian.org to find sources.
(All Variants) Make sure the linux symbolic link points to the new source:
rm linux
ln -s linux-2.4.26 linux
ls -l
lrwxrwxrwx 1 root root 12 May 7 12:55 linux -> linux-2.4.26
drwxr-xr-x 15 573 573 4096 Apr 14 15:05 linux-2.4.26
[...]
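The rm and ln commands above can be combined so that the link is never missing in between. This is only a sketch: the helper name point_linux_link is made up for this page, and it assumes an ln that supports the -n flag (GNU ln does).

```shell
# point_linux_link SRC_DIR TARGET
# Make SRC_DIR/linux a symbolic link to TARGET (a directory name inside SRC_DIR).
# Hypothetical helper; it refuses to touch "linux" if that is a real directory.
point_linux_link() {
    src_dir=$1
    target=$2
    if [ ! -d "$src_dir/$target" ]; then
        echo "no such source tree: $src_dir/$target" >&2
        return 1
    fi
    if [ -e "$src_dir/linux" ] && [ ! -L "$src_dir/linux" ]; then
        echo "$src_dir/linux exists and is not a symlink; not touching it" >&2
        return 1
    fi
    # -s symbolic link, -f replace an existing link,
    # -n treat the old link as a file instead of descending into its target
    ln -sfn "$target" "$src_dir/linux"
}

# Example: point_linux_link /usr/src linux-2.4.26
```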
(Step 2) VERIFY THE PATCH WORKS
If you are going to patch the source, it is wise to first go through the whole procedure once without making the patch, so you know possible problems are not related to the patch. However, before you do that, make sure that the patch you have is indeed for the source you are working with.
Download patch:
cd /usr/src
wget http://www.web100.org/download/kernel/current/web100-2.3.7-200404151744.tgz
tar xzf web100-2.3.7-200404151744.tgz
Verify if patch works:
cd linux-2.4.26
patch -p3 < ../web100/web100-2.4.26-2.3.7-200404151744.patch > web100.patch.out 2> web100.patch.err
cat web100.patch.*
No problems should be found. Patch works flawlessly.
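If you prefer to test the patch without touching the source tree at all, GNU patch has a --dry-run option that only reports what would happen. The sketch below assumes GNU patch; the helper name patch_applies is made up, and the -p level is passed in so that you can use the same value as for the real run.

```shell
# patch_applies TREE PATCHFILE PLEVEL
# Return 0 if PATCHFILE would apply cleanly inside TREE with "patch -pPLEVEL".
# PATCHFILE must be an absolute path, because the check runs inside TREE.
# --dry-run leaves every file untouched; -f stops patch from asking questions.
patch_applies() {
    tree=$1
    patchfile=$2
    plevel=$3
    ( cd "$tree" && patch --dry-run -f -p"$plevel" < "$patchfile" ) > /dev/null 2>&1
}

# Example (paths as used on this page):
#   patch_applies /usr/src/linux-2.4.26 \
#       /usr/src/web100/web100-2.4.26-2.3.7-200404151744.patch 3 && echo "patch is fine"
```

Because nothing is modified, there is nothing to revert afterwards.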
Now, revert the patch. Either use patch -R (with the same -p level that you used to apply it):
patch -Rp3 < ../web100/web100-2.4.26-2.3.7-200404151744.patch > web100.patch.out 2> web100.patch.err
Or by simply unpacking the sources again (see Step 1).
For example, for a downloaded tar file:
cd /usr/src
rm -rf linux-2.4.26
tar xf linux-2.4.26.tar
(Step 3) BACKUP THE CURRENT CONFIGURATION
If /boot is empty, its contents are most likely on a separate partition which was automatically unmounted after booting finished. Use mount, for example "mount /dev/sda1 /boot", to mount the partition. (You can usually find the correct device in /etc/mtab or /etc/fstab)
Make sure the current kernel configuration is copied in /boot/. Here it is:
-rw-r--r-- 1 root root 44361 Mar 13 2003 /boot/config-2.4.20-8smp
If it is not, make a copy:
cd /usr/src/linux
uname -r
2.4.20-8smp
cp /usr/src/linux-2.4.20-8smp/.config /boot/config-2.4.20-8smp (or whatever uname -r gave for you)
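The check and the copy can be done in one go. This is a sketch only: backup_kernel_config is a made-up helper name, and the source tree and boot directory are passed in as arguments because their locations differ per system.

```shell
# backup_kernel_config SRC_TREE BOOT_DIR [RELEASE]
# Copy SRC_TREE/.config to BOOT_DIR/config-RELEASE unless that file already exists.
# RELEASE defaults to the running kernel (uname -r).
backup_kernel_config() {
    src_tree=$1
    boot_dir=$2
    release=${3:-$(uname -r)}
    dest=$boot_dir/config-$release
    if [ -e "$dest" ]; then
        echo "already present: $dest"
        return 0
    fi
    # -p keeps the timestamp, which documents when the kernel was configured
    cp -p "$src_tree/.config" "$dest" && echo "saved: $dest"
}

# Example: backup_kernel_config /usr/src/linux /boot
```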
(Step 4) CLEAN CURRENT INSTALL / CONFIG
This is not needed if the kernel was downloaded from scratch, and is only necessary if you alter an existing kernel.
However, it does not hurt, so here it is for the sake of completeness:
cd /usr/src/linux
make clean
make mrproper
make clean (yes, again)
(Step 5) CONFIGURE THE NEW KERNEL
For ease, you can start with an old configuration file (you did save it at Step 3, didn't you?)
cp /boot/config-2.4.20-8smp .config
Now you can choose which of the tools you want to use to make changes to the .config file:
(a) On X:
startx (if you haven't started X already)
make xconfig
(b) Curses:
export TERM=xterm
make menuconfig
In the config it is important to check CPU type, SMP support (for dual processors), and to enable loadable module support. Be sure to check these, especially if you are starting from scratch.
Note: If you are going to build a complete configuration from scratch, make very sure that you build drivers for all your hardware as listed in /etc/modules.conf. Otherwise, you will fail at Step 10. If you are creating a derivative of your current system, there are most probably easy workarounds at Step 10, because you have already compiled the modules earlier.
Optionally, you can edit the .config file by hand. For example, to set a parameter like CONFIG_E1000=m (for the Intel 1Gbps network card)
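Editing .config by hand can also be scripted, which is handy when the same change must be repeated on several trees. A minimal sketch, assuming the usual .config layout ("NAME=value" lines and "# NAME is not set" lines); the helper name set_config_option is made up.

```shell
# set_config_option CONFIG_FILE NAME VALUE
# Set NAME=VALUE in a kernel .config: replace an existing assignment or a
# "# NAME is not set" line, or append the option if it is absent.
set_config_option() {
    config_file=$1
    name=$2
    value=$3
    if grep -q -e "^$name=" -e "^# $name is not set" "$config_file"; then
        sed -e "s|^$name=.*|$name=$value|" \
            -e "s|^# $name is not set\$|$name=$value|" \
            "$config_file" > "$config_file.tmp" &&
            mv "$config_file.tmp" "$config_file"
    else
        echo "$name=$value" >> "$config_file"
    fi
}

# Example: set_config_option .config CONFIG_SCSI_AIC7XXX m
```

After a manual change it is a good idea to run make oldconfig, so that the kernel build system can ask about any options that depend on the one you changed.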
(Step 6) MAKE DEPENDENCIES
cd /usr/src/linux
make dep
(not needed for 2.6 kernels)
(Step 7) GIVE NEW KERNEL A UNIQUE NAME
Edit Makefile
cd /usr/src/linux
vi Makefile (or pico -w Makefile if vi is not your favourite editor)
Edit the line starting with EXTRAVERSION, and change it to something unique. For example
EXTRAVERSION = -3.plain.20031015
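If you would rather not open an editor, the same change can be made with sed. This is a sketch; set_extraversion is a made-up helper name, and it assumes the Makefile has exactly one line starting with EXTRAVERSION, as the 2.4 top-level Makefile does.

```shell
# set_extraversion MAKEFILE SUFFIX
# Rewrite the EXTRAVERSION line of a kernel Makefile and print the new line.
set_extraversion() {
    makefile=$1
    suffix=$2
    sed "s|^EXTRAVERSION[[:space:]]*=.*|EXTRAVERSION = $suffix|" "$makefile" \
        > "$makefile.tmp" &&
        mv "$makefile.tmp" "$makefile"
    # Show the result so that a typo is noticed before the long build starts
    grep '^EXTRAVERSION' "$makefile"
}

# Example: set_extraversion /usr/src/linux/Makefile -3.plain.20031015
```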
(Step 8) MAKE
Note: you can do the following two steps simultaneously. Both will take a long time, so plan your lunch / night's sleep / long break here. You can also combine the two commands into one: # make bzImage modules
Build the kernel image:
make bzImage
Alternatively, if you want to keep the stdout:
nohup make bzImage &
tail -f nohup.out (monitor progress)
Build the modules:
make modules
Alternatively, if you want to keep the output:
nohup make modules 1> modules.out 2> modules.err &
Verify that the bzImage was correctly created:
ls -l /usr/src/linux/arch/i386/boot/bzImage
Inspect the outputs of both make's, and check if there have been any errors.
If there have been, stop and optionally start over.
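Reading a few thousand lines of compiler output is tedious. When a command fails, make itself prints a line containing "***", so searching for that marker is a quick first check. The helper below is only a sketch (the name build_log_has_errors is made up) and it does not replace a look at the warnings.

```shell
# build_log_has_errors LOGFILE...
# Print the lines on which make reported a failure, such as
#   make[1]: *** [foo.o] Error 1
#   make: *** No rule to make target `bar'.  Stop.
# Returns 0 if at least one such line was found, 1 if the logs look clean.
build_log_has_errors() {
    grep -n -E '^make(\[[0-9]+\])?: \*\*\* ' "$@"
}

# Example, using the log names from this step:
#   if build_log_has_errors nohup.out modules.out modules.err; then
#       echo "build failed, see the lines above"
#   fi
```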
(Step 9) INSTALL KERNEL AND MODULES
Install modules in /lib/modules (only if you enabled loadable module support, which you probably did in Step 5):
cd /usr/src/linux
make modules_install
Now you must copy the newly created kernel to the /boot/ directory. First check the current naming convention. It is either bzImage.plain.20031015 or vmlinuz-2.4.18-3.plain.20031015 (starting with bzImage or vmlinuz)
If /boot/ is empty, it is probably a separate partition which is not mounted. Mount it first; see the note at Step 3.
Install the kernel:
cp /usr/src/linux/arch/i386/boot/bzImage /boot/vmlinuz-2.4.18-3.plain.20031015 (Use the same version string as you chose in Step 7)
Copy the config file (this is technically not necessary, but for organisational purposes it is a must):
cp .config /boot/configure-2.4.18-3.plain.20031015 (Use the same version string as you chose in Step 7)
Copy the System map file (this is required by some tools, for example klogd):
cp System.map /boot/System.map-2.4.18-3.plain.20031015 (Use the same version string as you chose in Step 7)
ln -sf /boot/System.map-2.4.18-3.plain.20031015 /boot/System.map
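Typing the version string several times invites typos, so the copies can be wrapped in a small function that takes the version once. A sketch under assumptions: install_kernel_files is a made-up name, the file names follow the convention used on this page (vmlinuz-, configure-, System.map-), and the System.map link is made relative instead of absolute.

```shell
# install_kernel_files SRC_TREE BOOT_DIR RELEASE
# Copy the kernel image, its configuration and its System.map from SRC_TREE
# into BOOT_DIR, each with RELEASE appended, and repoint BOOT_DIR/System.map.
install_kernel_files() {
    src_tree=$1
    boot_dir=$2
    release=$3
    # Check everything first, so that a half-finished install cannot happen
    for f in arch/i386/boot/bzImage .config System.map; do
        [ -f "$src_tree/$f" ] || { echo "missing: $src_tree/$f" >&2; return 1; }
    done
    cp "$src_tree/arch/i386/boot/bzImage" "$boot_dir/vmlinuz-$release" &&
    cp "$src_tree/.config" "$boot_dir/configure-$release" &&
    cp "$src_tree/System.map" "$boot_dir/System.map-$release" &&
    ln -sf "System.map-$release" "$boot_dir/System.map"
}

# Example: install_kernel_files /usr/src/linux /boot 2.4.18-3.plain.20031015
```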
(Step 10) CREATING AN INITIAL RAMDISK
This step is optional for some systems. However, it is highly recommended that you try it anyway, since you will detect missing modules at an early stage.
Some systems do NOT need an initial ramdisk. For example, when using EFI and ELILO, current installations do not use a ramdisk (though they might in the future). It is best to check the bootloader config (/etc/lilo.conf, /etc/grub.conf or /etc/elilo.conf) for "initrd" lines, or to look for initrd-* or *.img files in the /boot/ directory. If there are any, you probably do need an initial ramdisk.
You can typically check whether this is the case if
cd /usr/src/linux
On Red Hat:
/sbin/mkinitrd initrd-2.4.18-3.plain.20031015.img 2.4.18-3.plain.20031015 (Use the same version string as you chose in Step 7; the second name technically refers to the directory in /lib/modules/, but you should just keep it the same)
On Debian:
/usr/sbin/mkinitrd -o initrd-2.4.18-3.plain.20031015.img 2.4.18-3.plain.20031015 (Use the same version string as you chose in Step 7; the second name technically refers to the directory in /lib/modules/, but you should just keep it the same)
This step may fail, typically with a message like:
No module aic7xxx found for kernel 2.4.18-3.plain.20031015
Background: mkinitrd makes an initial ramdisk (initrd), which is a very small root filesystem image holding the modules, such as SCSI and RAID drivers, that the kernel needs before it can reach the real root filesystem. It is typically used either as a first boot stage when / resides on a SCSI or RAID device (the kernel loads the drivers from the ramdisk, mounts / and then kicks off init), or to create a boot floppy.
mkinitrd looks at /etc/modules.conf to see which modules are necessary as a minimum. It looks in /lib/modules/ for the appropriate modules, and combines these into the initrd ramdisk.
If you get the error above, you should restart at Step 5.
Either use xconfig, making sure that the missing drivers are built as modules this time. For example, you can find aic7xxx in SCSI Support->SCSI Low-level Drivers->Adaptec AIC7xxx support
Or edit the .config file manually, find the required option, for example CONFIG_SCSI_AIC7XXX, and set it to m.
You can take a shortcut, leaving the kernel untouched, and only run make modules and make modules_install before retrying Step 10.
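You can spot a missing driver before running mkinitrd by comparing /etc/modules.conf with the freshly installed module tree. The sketch below rests on two assumptions that you should verify for your distribution: that the drivers mkinitrd wants are the ones named on "alias scsi_hostadapter" lines, and that modules are stored as uncompressed .o files (as 2.4 kernels normally do). The helper name missing_scsi_modules is made up.

```shell
# missing_scsi_modules MODULES_CONF MODULES_DIR
# Print every module named on an "alias scsi_hostadapter..." line of
# MODULES_CONF for which no NAME.o file exists below MODULES_DIR.
# Returns 0 if nothing is missing, 1 otherwise.
missing_scsi_modules() {
    modules_conf=$1
    modules_dir=$2
    missing=0
    for mod in $(awk '$1 == "alias" && $2 ~ /^scsi_hostadapter/ { print $3 }' "$modules_conf"); do
        if [ -z "$(find "$modules_dir" -name "$mod.o" -print 2>/dev/null)" ]; then
            echo "$mod"
            missing=1
        fi
    done
    return $missing
}

# Example:
#   missing_scsi_modules /etc/modules.conf /lib/modules/2.4.18-3.plain.20031015
```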
If all succeeded, copy the ramdisk:
cp initrd-2.4.18-3.plain.20031015.img /boot/
(Step 11) CONFIGURING THE BOOTLOADER
LILO is a boot loader for i386 machines. Alternatively you can use the newer GRUB, which has a few advantages over LILO. For non-x86 machines, you need another bootloader anyway (for example, on the ppc, you can use yaboot, quik, BootX or OpenFirmware directly). It is strongly advised to use the bootloader that is already in use.
(Variant A) For LILO:
Edit /etc/lilo.conf
And add a section, like this:
image=/boot/vmlinuz-2.4.18-3.plain.20031015
label=linux-2.4.18-3.plain
initrd=/boot/initrd-2.4.18-3.plain.20031015.img
read-only
root=/dev/sda2
(the initrd is only needed if you made an initial ramdisk)
Note that you MUST run the lilo command every time you create a new image, even if you don't edit lilo.conf! Otherwise, you will run into serious problems (lilo stores the location of the image file on the hard disk, and uses that at boot time).
/sbin/lilo
(Variant B) For GRUB:
Edit /boot/grub/menu.lst or /boot/grub/grub.conf
And add a section, like this:
title Red Hat Linux-up (2.4.26-smp)
root (hd0,0)
kernel /vmlinuz-2.4.26-smp ro root=LABEL=/
initrd /initrd-2.4.26-smp.img
(the initrd is only needed if you made an initial ramdisk)
Adjust the default, and optionally the fallback lines as well.
(Check the boot device, typically /dev/sda or /dev/hda. Use df)
/sbin/grub-install /dev/sda
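The number on the default line is the position of the wanted entry among the title lines, counted from zero. Counting by hand is error prone in a long menu, so here is a sketch that does it with awk. The helper name grub_entry_index is made up, and it assumes the GRUB (legacy) file format shown above, in which every entry starts with a line whose first word is title.

```shell
# grub_entry_index GRUB_CONF PATTERN
# Print the zero-based position of the first "title" line that contains
# PATTERN. Returns 1 (and prints nothing) if no title matches.
grub_entry_index() {
    awk -v pat="$2" '
        $1 == "title" {
            if (index($0, pat) > 0) { print n + 0; found = 1; exit }
            n++
        }
        END { if (!found) exit 1 }
    ' "$1"
}

# Example: grub_entry_index /boot/grub/grub.conf 2.4.26-smp
```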
(Variant C) For ELILO:
For Elilo, as currently configured on the theo Itanium, the procedure is not ideal, but manageable:
First you do not need an initial ramdisk (though this is likely to change in forthcoming releases, be sure to check).
Second, you need to copy the vmlinuz files out of the /boot partition:
Make sure /boot is mounted
Copy (or move, but copy is safer) all files from /boot/ to /:
cp /boot/vmlinuz* /
cp /boot/System.map* /
Now, you need to unmount /boot to run elilo.
Edit /etc/elilo.conf
(do not edit /boot/efi/debian/elilo.conf; this file will be overwritten later on!)
And add a section, like this:
image=/vmlinuz-2.4.22-web100
label=2.4.22-web100
root=/dev/sda3
append="console=tty console=ttyS0,9600"
read-only
Adjust the default line as well.
Run elilo to copy the kernel (and other files) from /vmlinuz-2.4.22-web100 to /boot/efi/debian/.
This should mount /boot, copy the files, and unmount /boot again.
(Step 12) REBOOT THE MACHINE
TROUBLESHOOTING
ds: no socket drivers loaded!
kmod: failed to exec /sbin/modprobe -s -k block-major-8, errno=2
VFS: Cannot open root device "802" or 08:02
Please append a correct "root=" boot option
Kernel panic: VFS: Unable to mount root fs on 08:02
This means that the kernel cannot access the root filesystem after loading the initrd. You probably need to add more modules to your initrd (Step 10).
Typically, your ramdisk does not contain SCSI support. Check whether aic7xxx (or the driver for your own SCSI controller) is listed in /etc/modules.conf
B: Installing a patched 2.4.26-web100 kernel on vangogh0
Go through the above steps, only with a patched kernel:
1. Grab source (download variant)
cd /usr/src
mv linux-2.4.26 linux-2.4.26-orig
tar xf linux-2.4.26.tar
mv linux-2.4.26 linux-2.4.26-web100
mv linux-2.4.26-orig linux-2.4.26
rm linux
ln -s linux-2.4.26-web100 linux
2. Apply the patch
cd linux-2.4.26-web100
patch -p3 < ../web100/web100-2.4.26-2.3.7-200404151744.patch > web100.patch.out 2> web100.patch.err
cat web100.patch.*
3. Backup the current configuration
already did so
4. Clean current install / config
make clean
make mrproper
make clean (yes, again)
5. Configure the new kernel
Just use the same configuration:
cp ../linux-2.4.26/.config ./
Change options as recommended in the Web100 kernel INSTALL readme file.
Code maturity level options --->
  [*] Prompt for development and/or incomplete code/drivers [was already set]
Processor type and features --->
  [*] Symmetric multi-processing support (Select for SMPs only) [was already set]
Networking options --->
  [*] Web100 networking enhancements (NEW)
  [*] Web100: TCP statistics (NEW)
  (0666) Web100: Default file permissions
  (0) Web100: Default gid
  [*] Web100: Net100 extensions (NEW)
  [*] Web100: netlink event notification service (NEW)
  [ ] Web100: fair sharing of socket memory (NEW) [left out, because undocumented]
  (7) Web100: default winscale initial value (NEW)
6. Make dependencies
make dep
7. Give new kernel a unique name
Edit Makefile, and set:
EXTRAVERSION = -web100
8. Make
nohup make bzImage modules 1> make.out 2> make.err &
Check logfiles: no errors?
9. Install kernel and modules
make modules_install
cp arch/i386/boot/bzImage /boot/vmlinuz-2.4.26-web100
cp .config /boot/configure-2.4.26-web100
cp System.map /boot/System.map-2.4.26-web100
ln -s /boot/System.map-2.4.26-web100 /boot/System.map
10. Create an initial ramdisk
/sbin/mkinitrd initrd-2.4.26-web100.img 2.4.26-web100
cp initrd-2.4.26-web100.img /boot/
11. Configure the bootloader (grub variant)
pico -w /boot/grub/grub.conf
Add section:
default=3
title Red Hat Linux - SMP web100 (2.4.26-web100)
root (hd0,0)
kernel /vmlinuz-2.4.26-web100 ro root=LABEL=/
initrd /initrd-2.4.26-web100.img
/sbin/grub-install /dev/sda
12. Reboot
C: Copying the kernel to the other nodes
(Step 1) CONNECT TO DESTINATION NODE
# ssh root@vangogh1
(Step 2) COPY FILES
cd /boot/
scp -p "root@vangogh0:/boot/*2.4.26-web100*" /boot/
Should copy:
System.map-2.4.26-web100
configure-2.4.26-web100
initrd-2.4.26-web100.img
vmlinuz-2.4.26-web100
Note: the command below will follow symbolic links, in particular the
build link, which points at the kernel source tree and can be rather
huge (> 100 MB). As an alternative, you may want to use tar to copy
the files, which does not traverse symbolic links.
cd /lib/modules/
scp -pr root@vangogh0:/lib/modules/2.4.26-web100 /lib/modules/
Should copy a lot of files.
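The tar alternative mentioned in the note could look like the sketch below. The function copies between two local directories so that it can be tried out safely; the helper name copy_tree_with_tar is made up. The comment at the end shows the same pipe run through ssh, using the host and directory names from this page.

```shell
# copy_tree_with_tar SRC_PARENT NAME DEST_PARENT
# Copy directory NAME from SRC_PARENT into DEST_PARENT through a tar pipe.
# tar stores a symbolic link as a link, so the "build" link is copied as a
# link and the source tree behind it is not transferred.
copy_tree_with_tar() {
    tar -C "$1" -cf - "$2" | tar -C "$3" -xpf -
}

# Across the network the same pipe runs through ssh, for example:
#   ssh root@vangogh0 'tar -C /lib/modules -cf - 2.4.26-web100' | tar -C /lib/modules -xpf -
```

On the destination node the build link will then point at a source tree that may not exist there, which is harmless as long as you do not compile extra modules on that node.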
(Step 3) CONFIGURE THE BOOTLOADER
Note: The vangogh nodes used to use LILO. They have since been switched to GRUB.
(Variant A) For GRUB:
Edit /boot/grub/grub.conf
And add a section, like this:
title Red Hat Linux-up (2.4.26-web100)
root (hd0,0)
kernel /vmlinuz-2.4.26-web100 ro root=/dev/sda2
initrd /initrd-2.4.26-web100.img
Adjust the default, and optionally the fallback lines as well.
(Check the boot device, typically /dev/sda or /dev/hda. Use df)
/sbin/grub-install /dev/sda
(Step 4) REBOOT THE MACHINE
TROUBLESHOOTING
Mounting /proc filesystem
Creating block devices
Creating root devices
mkrootdev: label / not found
Mounting root filesystem
mount: error 2 mounting ext3
pivotroot: pivot_root(/sysroot, /sysroot/initrd) failed: 2
umount /initrd/proc failed: 2
freeing unused kernel memory: 133k freed
Kernel panic: No init found. Try passing init= option to kernel
The "label / not found" error indicates that no partition with the label "/" could be found. Apparently, the root filesystem on this node does not carry that label. The easiest workaround is to specify the device instead of the label as the startup parameter.
Thus, in grub.conf (or something similar in lilo.conf), change the line:
kernel /vmlinuz-2.4.26-web100 ro root=LABEL=/
to the line:
kernel /vmlinuz-2.4.26-web100 ro root=/dev/sda2
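With many nodes to fix, the substitution can be done with sed instead of an editor. A sketch only: use_device_root is a made-up helper name, and /dev/sda2 in the example is the root device of the vangogh nodes, so check yours with df first.

```shell
# use_device_root GRUB_CONF DEVICE
# Rewrite every "root=LABEL=..." kernel argument in GRUB_CONF to "root=DEVICE".
# The original file is kept as GRUB_CONF.bak.
use_device_root() {
    grub_conf=$1
    device=$2
    cp -p "$grub_conf" "$grub_conf.bak" &&
        sed "s|root=LABEL=[^[:space:]]*|root=$device|" "$grub_conf.bak" > "$grub_conf"
}

# Example: use_device_root /boot/grub/grub.conf /dev/sda2
```

Note that this changes all entries in the file, including those of older kernels.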