I need the collective wisdom of people who work with embedded devices.
Background:
I'm working on an alternate firmware for the Meraki MS220 switch. It's a MIPS-based device that originally ships with Redboot and a NOR-based kernel + embedded initrd, which kexecs into a NAND-based kernel + embedded initrd. This is very inelegant, and the tools for working with TSOP48 NAND chips are expensive and cumbersome compared to NOR.
I've made some decent progress in modifying the firmware. The CRC and size checks have been patched out of Redboot (by binary patching; Cisco has provided the bootloader source code but no instructions on how to turn the compiled binary into something flashable), allowing me to boot the NAND kernel directly from Redboot. This is important because the out-of-tree binary modules shipped with their firmware are needed to manage the switching ASIC, and they're only provided for the kernel version that runs from NAND.
But I'm not here to talk about Redboot or Cisco's firmware shenanigans because I don't care about those.
Someone added support for the SoC to u-boot in 2018. u-boot 2019.10 compiles and runs on the switch (with networking, even). I figured out that I needed to add support in the kernel for reading the argc and argv from u-boot.
This is all done and working:
## Booting kernel from Legacy Image at 81000000 ...
   Image Name: Linux 3.18.123
   Image Type: MIPS Linux Kernel Image (uncompressed)
   Data Size: 2165421 Bytes = 2.1 MiB
   Load Address: 81000000
   Entry Point: 81000000
   Verifying Checksum ... OK
   Loading Kernel Image
[ 0.000000] Linux version 3.18.123-meraki-elemental (hmartin@alp) (gcc version 5.4.0 (GCC) ) #42 Fri May 1 16:09:10 UTC 2020
[ 0.000000] bootconsole [early0] enabled
[ 0.000000] CPU0 revision is: 02019654 (MIPS 24KEc)
[ 0.000000] Determined physical RAM map:
[ 0.000000] memory: 00477000 @ 00100000 (usable)
[ 0.000000] memory: 00049000 @ 00577000 (usable after init)
[ 0.000000] User-defined physical RAM map:
[ 0.000000] memory: 07ff0000 @ 00000000 (usable)
[ 0.000000] Zone ranges:
[ 0.000000] Normal [mem 0x00000000-0x07feffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x00000000-0x07feffff]
[ 0.000000] Initmem setup node 0 [mem 0x00000000-0x07feffff]
[ 0.000000] Reserving 0MB of memory at 0MB for crashkernel
[ 0.000000] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[ 0.000000] Primary data cache 32kB, 4-way, VIPT, cache aliases, linesize 32 bytes
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 32496
[ 0.000000] Kernel command line: console=ttyS0,115200 mtdparts=m25p80:0x80000(uboot),0x40000(uboot-env),0x40000(uboot-conf),0x300000(kernel),0x800000(squashfs),0x400000(jffs2) root=/dev/mtdblock5 rootfstype=squashfs mem=134152192
[ 0.000000] PID hash table entries: 512 (order: -1, 2048 bytes)
[ 0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
[ 0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
[ 0.000000] Writing ErrCtl register=80000811
[ 0.000000] Readback ErrCtl register=80000811
[ 0.000000] Cache parity protection enabled
[ 0.000000] Memory: 123832K/131008K available (3633K kernel code, 187K rwdata, 740K rodata, 292K init, 119K bss, 7176K reserved, 0K cma-reserved)
[ 0.000000] SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[ 0.000000] NR_IRQS:66
[ 0.000000] sched_clock: 32 bits at 1kHz, resolution 1000000ns, wraps every 2147483648000000ns
[ 0.002000] Calibrating delay loop... 275.45 BogoMIPS (lpj=137728)
[ 0.013000] pid_max: default: 32768 minimum: 301
[ 0.014000] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.015000] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.018000] ftrace: allocating 11996 entries in 24 pages
[ 0.046000] Performance counters: mips/24K PMU enabled, 2 32-bit counters available to each CPU, irq -1 (share with timer interrupt)
[ 0.053000] devtmpfs: initialized
[ 0.059000] NET: Registered protocol family 16
[ 0.094000] Switched to clocksource MIPS
[ 0.133000] NET: Registered protocol family 2
[ 0.140000] TCP established hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.147000] TCP bind hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.153000] TCP: Hash tables configured (established 1024 bind 1024)
[ 0.160000] TCP: reno registered
[ 0.163000] UDP hash table entries: 256 (order: 0, 4096 bytes)
[ 0.169000] UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
[ 0.175000] NET: Registered protocol family 1
[ 0.183000] VCORE-III Watchdog Timer enabled (30 seconds). Prev boot was not caused by WDT reset.
[ 0.194000] futex hash table entries: 256 (order: -1, 3072 bytes)
[ 0.253000] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[ 0.259000] jffs2: version 2.2. (NAND) © 2001-2006 Red Hat, Inc.
[ 0.266000] msgmni has been set to 241
[ 0.295000] io scheduler noop registered
[ 0.299000] io scheduler deadline registered (default)
[ 0.313000] Serial: 8250/16550 driver, 1 ports, IRQ sharing disabled
[ 0.328000] console [ttyS0] disabled
[ 0.331000] serial8250.0: ttyS0 at MMIO 0x70100000 (irq = 14, base_baud = 13020833) is a 16550A
[ 0.340000] console [ttyS0] enabled
[ 0.340000] console [ttyS0] enabled
[ 0.347000] bootconsole [early0] disabled
[ 0.347000] bootconsole [early0] disabled
[ 0.364000] nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xf1
[ 0.371000] nand: Micron MT29F1G08ABADAWP
[ 0.375000] nand: 128MiB, SLC, page size: 2048, OOB size: 64
[ 0.386000] Scanning device for bad blocks
[ 0.510000] m25p80 spi0.1: found mx25l12805d, expected m25p80
[ 0.516000] m25p80 spi0.1: mx25l12805d (16384 Kbytes)
[ 0.521000] 6 cmdlinepart partitions found on MTD device m25p80
[ 0.527000] Creating 6 MTD partitions on "m25p80":
[ 0.532000] 0x000000000000-0x000000080000 : "uboot"
[ 0.542000] 0x000000080000-0x0000000c0000 : "uboot-env"
[ 0.551000] 0x0000000c0000-0x000000100000 : "uboot-conf"
[ 0.562000] 0x000000100000-0x000000400000 : "kernel"
[ 0.571000] 0x000000400000-0x000000c00000 : "squashfs"
[ 0.583000] 0x000000c00000-0x000001000000 : "jffs2"
[ 0.591000] tun: Universal TUN/TAP device driver, 1.6
[ 0.597000] tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
[ 0.608000] i2c /dev entries driver
[ 0.614000] TCP: cubic registered
[ 0.617000] Initializing XFRM netlink socket
[ 0.624000] NET: Registered protocol family 10
[ 0.632000] NET: Registered protocol family 17
[ 0.632000] NET: Registered protocol family 17
[ 0.637000] NET: Registered protocol family 15
[ 0.641000] 8021q: 802.1Q VLAN Support v1.8
[ 0.646000] Meraki MS220-8 board detected
[ 0.655000] i2c-gpio i2c-gpio.1: using pins 6 (SDA) and 5 (SCL)
[ 0.669000] devtmpfs: mounted
[ 0.706000] VFS: Mounted root (squashfs filesystem) readonly on device 31:5.
[ 0.756000] devtmpfs: mounted
[ 0.764000] Freeing unused kernel memory: 292K
[ 4.191000] devpts: called with bogus options
[ 5.654000] random: dropbear urandom read with 85 bits of entropy available
[ 125.920000] random: nonblocking pool is initialized
Why I need your collective wisdom
Serial output breaks when the kernel hands off to init/userspace. Looking at the kernel messages above, I know userspace is alive because I can see dropbear reading urandom at 5.6 seconds and, by the 125-second mark, the kernel has finished initializing the nonblocking random pool. If userspace were dead, I'd have a kernel panic and the device would reset in 5 seconds.
My rootfs is minimal and built using buildroot. I know it works, because I compiled the kernel with TTY_PRINTK support, and created a start-up script in /etc/init.d/ to print out to /dev/ttyprintk to show that userspace is alive:
[ 4.779000] S05printk starting; userspace is alive
[ 5.689000] random: dropbear urandom read with 85 bits of entropy available
[ 7.356000] Hello from S51printk; userspace is alive but serial is broken
Buildroot by default uses busybox for init. Okay, so I thought maybe busybox has some bug and isn't outputting to /dev/ttyS0. So I configured buildroot to use OpenRC instead of busybox for init. Nope, still lose my serial output once the kernel invokes init.
The kernel command line is correct, and serial output works for all of printk. I haven't gone gangbusters modifying the command line since moving from Redboot to u-boot anyway, just some mtdparts changes that were necessary for u-boot/env.
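The mtdparts string on the command line can be sanity-checked against the partition table the kernel prints in the boot log. The following is a minimal host-side Python sketch, not part of the firmware; the helper name `parse_mtdparts` is hypothetical, and it only handles the plain `size(name)` form used on this command line.

```python
import re

# mtdparts value copied from the kernel command line in the boot log.
CMDLINE_MTDPARTS = (
    "m25p80:0x80000(uboot),0x40000(uboot-env),0x40000(uboot-conf),"
    "0x300000(kernel),0x800000(squashfs),0x400000(jffs2)"
)


def parse_mtdparts(spec):
    """Return (device, [(name, start, end), ...]) for a single-device spec.

    Only the simple "size(name)" form is handled; "@offset", "-" (remaining
    space) and the "ro"/"lk" flags are deliberately not supported here.
    """
    device, _, parts = spec.partition(":")
    layout = []
    offset = 0
    for part in parts.split(","):
        match = re.fullmatch(r"\s*(0x[0-9a-fA-F]+|\d+)\((.+?)\)\s*", part)
        if match is None:
            raise ValueError(f"unsupported partition spec: {part!r}")
        size = int(match.group(1), 0)  # base 0 accepts both hex and decimal
        layout.append((match.group(2), offset, offset + size))
        offset += size
    return device, layout


device, layout = parse_mtdparts(CMDLINE_MTDPARTS)
for name, start, end in layout:
    # Same layout as the "Creating 6 MTD partitions" lines in the log.
    print(f'0x{start:012x}-0x{end:012x} : "{name}"')
```

The printed ranges should line up with the six partitions listed in the log, and the last partition should end at 16 MiB, the size reported for the mx25l12805d.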
If I boot using Redboot instead of u-boot (same kernel, just lacking the argc/argv patch), serial output from userspace is working:
[ 1.664000] VFS: Mounted root (squashfs filesystem) readonly on device 31:3.
[ 1.695000] devtmpfs: mounted
[ 1.703000] Freeing unused kernel memory: 292K
init started: BusyBox v1.25.1 (2018-08-24 12:51:28 PDT)
BusyBox v1.25.1 (2018-08-24 12:51:28 PDT) built-in shell (ash)
But this. doesn't. make. sense. The bootloader shouldn't have anything to do with the handover between the kernel and init/userspace. Kernel printk works in both cases, but serial output breaks once the kernel invokes init only when booting from u-boot. The load address of the kernel differs between Redboot (0x80100000) and u-boot (0x81000000), but if I were somehow messing up the memory map, I'd expect to see zero output from printk and (more likely) a kernel panic during boot.
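Since the bootloader is the only thing that changes between the two boots, one way to narrow this down is to diff the two boot logs with the printk timestamps stripped. This is a rough host-side Python sketch, offered as a comparison aid rather than a diagnosis; the function names are hypothetical, and the two excerpts are short fragments of the logs in this post rather than the full captures.

```python
import difflib
import re

# Matches a leading printk timestamp such as "[ 1.695000] ".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def normalize(log_text):
    """Strip printk timestamps and blank lines so only messages are compared."""
    lines = (TIMESTAMP.sub("", line).rstrip() for line in log_text.splitlines())
    return [line for line in lines if line]


def diff_boot_logs(uboot_log, redboot_log):
    """Return unified-diff lines (Redboot -> u-boot), ignoring timestamps."""
    return list(difflib.unified_diff(
        normalize(redboot_log), normalize(uboot_log),
        fromfile="redboot", tofile="u-boot", lineterm="", n=0,
    ))


# Short excerpts taken from the two logs in this post.
redboot_excerpt = """\
[ 1.695000] devtmpfs: mounted
[ 1.703000] Freeing unused kernel memory: 292K
init started: BusyBox v1.25.1 (2018-08-24 12:51:28 PDT)
"""
uboot_excerpt = """\
[ 0.756000] devtmpfs: mounted
[ 0.764000] Freeing unused kernel memory: 292K
[ 4.191000] devpts: called with bogus options
"""

for line in diff_boot_logs(uboot_excerpt, redboot_excerpt):
    print(line)
```

Run against the complete captures, lines that appear in only one log (a different command line, memory map, or console registration order) would be the first places to look.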
I did configure getty and it is supposed to be listening on ttyS0, but since there's not even any serial output from init, getty isn't accessible:
# Put a getty on the serial port
ttyS0::respawn:/sbin/getty -L ttyS0 115200 vt100 # GENERIC_SERIAL
I cannot figure out why I'm losing the serial console when the kernel invokes init. What am I missing here?
Here is the kernel .config