
HowTo

The following guides are valid for CentOS 7.

Lustre

Build (as of 04.12.2017)

HowTo build a one-node Lustre file system with ZFS backend:

  1. Prepare System
    1. Install the kernel development tools
      yum -y groupinstall 'Development Tools'
      yum -y install epel-release
    2. Install additional dependencies
      yum -y install xmlto asciidoc elfutils-libelf-devel zlib-devel binutils-devel newt-devel python-devel hmaccalc perl-ExtUtils-Embed bison elfutils-devel  audit-libs-devel python-docutils sg3_utils expect attr lsof quilt libselinux-devel kernel-devel libyaml-devel
    3. Disable SELinux for older clients (the change takes effect after the next reboot)
      sed -i '/^SELINUX=/s/.*/SELINUX=disabled/' /etc/selinux/config 
  2. Prepare ZFS backend (follow the guide for packaged ZFS, or see the ZFS section below for a custom ZFS build)
    1. EPEL release
      URL='http://download.zfsonlinux.org'
      yum -y install --nogpgcheck $URL/epel/zfs-release.el7.noarch.rpm 
    2. For the newest Lustre releases
      • Change /etc/yum.repos.d/zfs.repo to switch from dkms to kmod: disable the [zfs] repository and enable [zfs-kmod]
         [zfs]
         name=ZFS on Linux for EL 7 - dkms
         baseurl=http://download.zfsonlinux.org/epel/7/$basearch/
        -enabled=1
        +enabled=0
         metadata_expire=7d
         gpgcheck=1
         gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux
        @@ -9,7 +9,7 @@
         [zfs-kmod]
         name=ZFS on Linux for EL 7 - kmod
         baseurl=http://download.zfsonlinux.org/epel/7/kmod/$basearch/
        -enabled=0
        +enabled=1
         metadata_expire=7d
         gpgcheck=1
         gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux
    3. Install ZFS and its associated SPL packages
      • kmod packages for newer releases
        yum install -y zfs libzfs2-devel kmod-spl-devel kmod-zfs-devel 
      • dkms packages for older releases
        yum install -y zfs libzfs2-devel zfs-dkms 
  3. Build Lustre
    1. Get Lustre source code
      git clone git://git.hpdd.intel.com/fs/lustre-release.git
    2. Configure (--disable-ldiskfs for ZFS backend, --disable-server for client only)
      cd lustre-release/
      sh ./autogen.sh
      ./configure --disable-ldiskfs
    3. Make and install rpms
      make rpms
      yum -y install *.$(arch).rpm
  4. You may need to reboot and then explicitly load the ZFS and Lustre modules
    reboot
    modprobe zfs
    modprobe lustre
  5. Format targets (replace the /tmp paths in this example with real devices or partitions)
    mkfs.lustre --mgs --backfstype=zfs --fsname=lustre --device-size=1048576 lustre-mgs/mgs /tmp/lustre-mgs
    mkfs.lustre --mdt --backfstype=zfs --fsname=lustre --index=0 --mgsnode=$(hostname)@tcp --device-size=1048576 lustre-mdt0/mdt0 /tmp/lustre-mdt0
    mkfs.lustre --ost --backfstype=zfs --fsname=lustre --index=0 --mgsnode=$(hostname)@tcp --device-size=1048576 lustre-ost0/ost0 /tmp/lustre-ost0
    1. Add the targets to /etc/ldev.conf (replace hostname with the name of your node)
      hostname - mgs     zfs:lustre-mgs/mgs
      hostname - mdt0    zfs:lustre-mdt0/mdt0
      hostname - ost0    zfs:lustre-ost0/ost0
  6. Run Lustre
    1. Reconfigure the firewall to allow incoming connections on TCP port 988 (for socklnd only), or temporarily disable it
      systemctl stop firewalld
      systemctl disable firewalld 
    2. Start servers
      systemctl start lustre
    3. Mount client
      mkdir -p /mnt/lustre/client
      mount -t lustre $(hostname):/lustre /mnt/lustre/client
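
The /etc/ldev.conf entries from step 5 can be generated instead of typed by hand. The following is a minimal sketch, assuming the pool and dataset names used in the mkfs.lustre commands above; the function name ldev_conf_lines is hypothetical and not part of Lustre.

```shell
# Print the three ldev.conf lines for a one-node setup.
# $1: node name (hypothetical parameter); defaults to the output of hostname.
ldev_conf_lines() {
    host=${1:-$(hostname)}
    for target in mgs mdt0 ost0; do
        # Format: <local host> <foreign host or -> <label> <device>
        printf '%s - %-7s zfs:lustre-%s/%s\n' "$host" "$target" "$target" "$target"
    done
}
```

Review the output first, then append it with ldev_conf_lines >> /etc/ldev.conf.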

ZFS

HowTo build a custom ZFS:

  1. Prepare System (example: CentOS 7)
    1. Disable SELinux
      sed -i '/^SELINUX=/s/.*/SELINUX=disabled/' /etc/selinux/config 
    2. Install the kernel development tools
      yum -y groupinstall 'Development Tools'
      yum -y install epel-release
    3. Install additional dependencies
      yum -y install parted lsscsi wget ksh
      yum -y install kernel-devel zlib-devel libattr-devel libuuid-devel libblkid-devel libselinux-devel libudev-devel
      yum -y install device-mapper-devel openssl-devel
  2. Clone both Git repositories if you need release 0.7 or older (in newer releases SPL was merged into the ZFS repository). Check the Lustre Support Matrix to see which ZFS version your Lustre release needs.
    git clone https://github.com/zfsonlinux/spl.git
    git clone https://github.com/zfsonlinux/zfs.git
  3. Perform the following steps in both directories (complete spl first)
    1. Configure for specific system
      cd <spl|zfs>
      ./autogen.sh
      ./configure --with-spec=redhat
    2. Build RPMs in both directories
      • kmod
        make pkg-utils pkg-kmod
    3. Install RPMs
      yum localinstall *.$(arch).rpm
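
Whether the separate spl directory is needed depends on the ZFS version, as noted in step 2. The following is a minimal sketch of that decision, assuming a plain dotted version string such as 0.7.13 (no zfs- prefix); the function names are hypothetical.

```shell
# Succeed if the given ZFS version is 0.7 or older, i.e. SPL is still
# a separate repository. $1: version string such as 0.7.13
needs_separate_spl() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # second version component
    [ "$major" -eq 0 ] && [ "$minor" -le 7 ]
}

# Print the directories to build, in build order.
zfs_build_dirs() {
    if needs_separate_spl "$1"; then
        echo "spl zfs"
    else
        echo "zfs"
    fi
}
```

For example, for dir in $(zfs_build_dirs 0.7.13) loops over spl first and then zfs.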

Debugging

Print

The easiest way to print an error message in the kernel is the common printk function with the KERN_ERR level.

Lustre

Lustre defines several macros for printing messages. The most useful ones are CDEBUG and CERROR.

CDEBUG prints a message to the debug log.
CDEBUG(D_INFO, "Debug message: rc=%d\n", number);
CERROR prints a message to the debug log and to the console.
CERROR("Something bad happened: rc=%d\n", rc);

Show the debug log with lctl debug_kernel. CERROR messages are always printed; CDEBUG messages are printed depending on their level parameter (D_INFO in the example) and the debugging level set with lctl.

Get debugging level with

 lctl get_param debug 

Set debugging level with

 lctl set_param debug="+info" 

More information can be found in the Lustre documentation.
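
The lctl commands above can be combined to capture only the messages produced by a single command. The following is a minimal sketch: the function name capture_lustre_debug is hypothetical, and lctl clear (which empties the kernel debug buffer) is an addition not mentioned in the text above.

```shell
# Run a command and dump the Lustre debug log it produced to a file.
# Usage: capture_lustre_debug <outfile> <command> [args...]
capture_lustre_debug() {
    outfile=$1
    shift
    lctl set_param debug="+info" >/dev/null || return 1  # enable D_INFO messages
    lctl clear >/dev/null || return 1                    # drop older log entries
    rc=0
    "$@" || rc=$?                                        # run the workload, keep its status
    lctl debug_kernel "$outfile" >/dev/null || return 1  # write the debug log to the file
    return "$rc"
}
```

For example, capture_lustre_debug /tmp/lustre-debug.log ls /mnt/lustre/client writes the log for one directory listing and returns the exit status of ls.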

ZFS

Some ZFS source files, such as dmu.c, are built for both user and kernel space, so an unguarded printk call causes compile errors in the user-space build. The following #ifdef guard avoids these errors.

#ifdef _KERNEL
        printk(KERN_ERR "Error message\n");
#endif

Crash

Crash is a tool for interactively analyzing the state of the Linux system while it is running, or after a kernel crash has occurred and a core dump has been created (crash(8)).

Enable the debuginfo repository in /etc/yum.repos.d/CentOS-Debuginfo.repo:

[base-debuginfo]
name=CentOS-7 - Debuginfo
baseurl=http://debuginfo.centos.org/7/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-Debug-7
-enabled=0
+enabled=1

Install kernel debugging symbols.

yum install kernel-debuginfo

Ensure that the package version matches the version of your running kernel (uname -r). You may need to update all packages first.
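
The version check can be scripted. The following is a minimal sketch, assuming the package is named kernel-debuginfo-<kernel release>, as rpm -q reports it on CentOS 7; both function names are hypothetical.

```shell
# Print the debuginfo package name expected for a kernel release.
# $1: kernel release as printed by uname -r
debuginfo_pkg_for() {
    printf 'kernel-debuginfo-%s\n' "$1"
}

# Succeed if the expected package is in the list of installed packages.
# $1: kernel release, $2: installed packages, one per line
has_matching_debuginfo() {
    printf '%s\n' "$2" | grep -qxF "$(debuginfo_pkg_for "$1")"
}
```

A possible call is has_matching_debuginfo "$(uname -r)" "$(rpm -q kernel-debuginfo)". If it fails, yum install "$(debuginfo_pkg_for "$(uname -r)")" requests the matching package explicitly.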

Install crash.

yum install crash

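Change the crashkernel option on the kernel command line in /boot/grub2/grub.cfg to, e.g., 128M (the auto option works only for RAM > 2 GB). To keep the setting when grub.cfg is regenerated, also set it in GRUB_CMDLINE_LINUX in /etc/default/grub.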

Reboot.
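
After the reboot it is worth checking that the running kernel was actually started with the new option. The following is a minimal sketch that extracts the crashkernel value from a kernel command line; the function name crashkernel_value is hypothetical.

```shell
# Print the value of the crashkernel= option in a kernel command line,
# or nothing if the option is absent.
# $1: command line string, e.g. the contents of /proc/cmdline
crashkernel_value() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^crashkernel=//p' | head -n 1
}
```

A possible call is crashkernel_value "$(cat /proc/cmdline)", which should print the value configured above, e.g. 128M.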

After a kernel crash you should find a vmcore file in /var/crash/<ip_date>/vmcore. Run crash with two arguments: the vmlinux file (in /usr/lib/debug/lib/modules/<version>/vmlinux) and the vmcore file.

crash $path/vmlinux $path2/vmcore

crash 7.2.3-8.el7
...
      KERNEL: /usr/lib/debug/lib/modules/3.10.0-957.12.2.el7.x86_64/vmlinux
    DUMPFILE: /var/crash/127.0.0.1-2019-05-24-12:11:16/vmcore  [PARTIAL DUMP]
        CPUS: 4
        DATE: Fri May 24 10:11:11 2019
      UPTIME: 00:21:34
LOAD AVERAGE: 0.05, 0.07, 0.06
       TASKS: 173
    NODENAME: client0
     RELEASE: 3.10.0-957.12.2.el7.x86_64
     VERSION: #1 SMP Tue May 14 21:24:32 UTC 2019
     MACHINE: x86_64  (2208 Mhz)
      MEMORY: 2 GB
       PANIC: "BUG: unable to handle kernel NULL pointer dereference at 0000000000000030"
         PID: 11998
     COMMAND: "ptlrpcd_00_03"
        TASK: ffff893eba301040  [THREAD_INFO: ffff893ebb54c000]
         CPU: 1
       STATE: TASK_RUNNING (PANIC)
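
The two paths passed to crash can be derived instead of searched for by hand. The following is a minimal sketch, assuming the directory layout described above; the function names are hypothetical, and parsing ls output assumes directory names without newlines.

```shell
# Print the path of the debug vmlinux for a kernel release.
# $1: kernel release as printed by uname -r
vmlinux_path() {
    printf '/usr/lib/debug/lib/modules/%s/vmlinux\n' "$1"
}

# Print the most recently modified vmcore below a crash directory.
# $1: crash directory (defaults to /var/crash); prints nothing if none exists
latest_vmcore() {
    ls -t "${1:-/var/crash}"/*/vmcore 2>/dev/null | head -n 1
}
```

A possible call is crash "$(vmlinux_path "$(uname -r)")" "$(latest_vmcore)". This only works if the kernel that is running now has the same release as the one that crashed.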

Backtrace

crash> bt
PID: 3981   TASK: ffff9997b4c6d140  CPU: 2   COMMAND: "ptlrpcd_00_02"
 #0 [ffff9997fa2876c0] machine_kexec at ffffffffb3e638e4
 #1 [ffff9997fa287720] __crash_kexec at ffffffffb3f1d0e2
...
#10 [ffff9997fa2879b0] async_page_fault at ffffffffb456c798
    [exception RIP: osc_build_rpc+1761]
    RIP: ffffffffc089ab01  RSP: ffff9997fa287a60  RFLAGS: 00010202
    RAX: ffff9997e25a5301  RBX: 0000000000000000  RCX: 0000000000000007
    RDX: 0000000000000006  RSI: ffff9997e25a5270  RDI: ffff9997b5b6d700
    RBP: ffff9997fa287b18   R8: 0000433600001f70   R9: ffffffffc089ab01
    R10: ffffdccdffd01f70  R11: ffffde45c1896940  R12: ffff9997ee2f8ec8
    R13: ffff9997f8ccf000  R14: 00000000fffffff4  R15: 0000000000400000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#11 [ffff9997fa287b20] osc_io_unplug0 at ffffffffc08b5547 [osc]
...
#19 [ffff9997fa287f50] ret_from_fork_nospec_begin at ffffffffb4575c1d

The most notable entry here is RIP: ffffffffc089ab01, the instruction pointer at the time of the crash. Look up the kernel module this address belongs to.

crash> sym ffffffffc089ab01
ffffffffc089ab01 (t) osc_build_rpc+1761 [osc] 

Load the symbols of that kernel module.

crash> mod -s osc
     MODULE       NAME    SIZE  OBJECT FILE
ffffffffc08d73e0  osc   425532  /lib/modules/3.10.0-957.12.2.el7.x86_64/extra/kernel/fs/lustre/osc.ko 

Now resolve the address once again to see the function and source line where the crash happened.

crash> sym ffffffffc089ab01
ffffffffc089ab01 (T) osc_build_rpc+1761 [osc] /root/lustre/lustre/osc/osc_request.c: 2993
research/projects/ipcc-l/howto.txt · Last modified: 2019-05-24 15:36 by Anna Fuchs