ranma commented on July 24, 2024

Tried the edge kernel image (Linux DietPi 6.8.11-edge-rockchip64 #1 SMP PREEMPT Sat May 25 14:28:41 UTC 2024 aarch64 GNU/Linux), but it shows the same issue.

from dietpi.

MichaIng commented on July 24, 2024

Probably the same as here (at least the same SoC): #6951

Though, your error messages seem to indicate a different driver issue. Does it work on cold boot/power cycle?

I just triggered a rebuild, with Linux 6.6.33, just in case it was some intermediate issue: https://github.com/MichaIng/DietPi/actions/runs/9521706157/job/26249731931

Once this has finished, to test it:

cd /tmp
wget https://dietpi.com/downloads/binaries/testing/linux-{image,dtb}-current-rockchip64.deb
dpkg -i linux-{image,dtb}-current-rockchip64.deb
reboot

... just found a matching report at the Armbian forum, but from March with Linux 6.1: https://forum.armbian.com/topic/35560-lost-network-in-rock-3a-after-upgrading-to-2421/
You should have had Linux 6.6 already, 6.6.31, isn't it? Strange that we did not receive any other report, if this issue was that old 🤔.

ranma commented on July 24, 2024

Does it work on cold boot/power cycle?

No, cold boot/power cycle doesn't help at all.

You should have had Linux 6.6 already, 6.6.31, isn't it?

I'm not sure what the previous kernel version was before the 9.5 update. Let me check if it's in the systemd logs somewhere...
No, looks like the journal isn't persisted. I should have made a disk image before attempting to upgrade...

Compared to bullseye I see some phy-related devicetree diffs:

        ethernet@fe010000 {
@@ -943,35 +970,36 @@
                resets = <0x0e 0xec>;
                reset-names = "stmmaceth";
                rockchip,grf = <0x1f>;
-               snps,axi-config = <0x4d>;
+               snps,axi-config = <0x4e>;
                snps,mixed-burst;
-               snps,mtl-rx-config = <0x4e>;
-               snps,mtl-tx-config = <0x4f>;
+               snps,mtl-rx-config = <0x4f>;
+               snps,mtl-tx-config = <0x50>;
                snps,tso;
                status = "okay";
-               snps,reset-gpio = <0x50 0x08 0x01>;
-               snps,reset-active-low;
-               snps,reset-delays-us = <0x00 0x4e20 0x186a0>;
                assigned-clocks = <0x0e 0x189 0x0e 0x186>;
                assigned-clock-parents = <0x0e 0x187 0x51>;
                clock_in_out = "input";
                phy-handle = <0x52>;
-               phy-mode = "rgmii";
+               phy-mode = "rgmii-id";
+               phy-supply = <0x1c>;
                pinctrl-names = "default";
                pinctrl-0 = <0x53 0x54 0x55 0x56 0x57 0x58>;
-               tx_delay = <0x4f>;
-               rx_delay = <0x26>;
-               phandle = <0xf0>;
+               phandle = <0x100>;
 
                mdio {
                        compatible = "snps,dwmac-mdio";
                        #address-cells = <0x01>;
                        #size-cells = <0x00>;
-                       phandle = <0xf1>;
+                       phandle = <0x101>;
 
                        ethernet-phy@0 {
                                compatible = "ethernet-phy-ieee802.3-c22";
                                reg = <0x00>;
+                               pinctrl-names = "default";
+                               pinctrl-0 = <0x59>;
+                               reset-assert-us = <0x4e20>;
+                               reset-deassert-us = <0x186a0>;
+                               reset-gpios = <0x5a 0x08 0x01>;
                                phandle = <0x52>;
                        };
                };
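
A note on reading the reset timing in that diff: the cells are hexadecimal, and converting them suggests that the old MAC-level snps,reset-delays-us values and the new PHY-level reset-assert-us / reset-deassert-us values carry the same numbers, so the reset description seems to have moved rather than changed. A minimal sketch in POSIX shell arithmetic (the helper name hex_to_dec is only illustrative):

```shell
#!/bin/sh
# Convert a devicetree hex cell such as 0x4e20 to decimal.
# hex_to_dec is a hypothetical helper name, not part of any tool.
hex_to_dec() {
    echo $(( $1 ))
}

# Values copied from the diff above (microseconds).
echo "reset-assert-us   = $(hex_to_dec 0x4e20)"
echo "reset-deassert-us = $(hex_to_dec 0x186a0)"
```

This gives 20000 and 100000 microseconds (20 ms and 100 ms), the same two non-zero values that appear in the removed snps,reset-delays-us = <0x00 0x4e20 0x186a0> line.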

I just triggered a rebuild, with Linux 6.6.33, just in case it was some intermediate issue
Still the same:

[    5.329669] rk_gmac-dwmac fe010000.ethernet: Can not read property: tx_delay.
[    5.330332] rk_gmac-dwmac fe010000.ethernet: set tx_delay to 0x30
[    5.330891] rk_gmac-dwmac fe010000.ethernet: Can not read property: rx_delay.
[    5.335858] rk_gmac-dwmac fe010000.ethernet: set rx_delay to 0x10
[    5.381975] mdio_bus stmmac-0: MDIO device at address 0 is missing.
[   10.750139] rk_gmac-dwmac fe010000.ethernet eth0: __stmmac_open: Cannot attach to PHY (error: -19)
[...]
root@DietPi:~# uname -a
Linux DietPi 6.6.33-current-rockchip64 #1 SMP PREEMPT Wed Jun 12 09:13:03 UTC 2024 aarch64 GNU/Linux
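
The two messages that matter in this log are the missing MDIO device and the failed PHY attach. As a rough sketch (the function name phy_probe_failed is made up for illustration, and the patterns are simply copied from the log lines above), a kernel log can be checked for them like this:

```shell
#!/bin/sh
# Sketch: report whether a kernel log shows the PHY probe failure seen above.
# Reads the log from stdin and returns 0 if either message is present.
# phy_probe_failed is a hypothetical helper name.
phy_probe_failed() {
    grep -E -q 'MDIO device at address [0-9]+ is missing|Cannot attach to PHY'
}

# Example use on a running system (needs permission to read the kernel log):
#   dmesg | phy_probe_failed && echo "PHY was not detected"
```

It only looks for these two strings, so a clean result does not prove that the link actually came up; checking ip link is still needed for that.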

This patch looks suspiciously like the reverse of the devicetree diff I see: https://github.com/armbian/build/blob/v24.2.1/patch/kernel/archive/rockchip64-6.1/board-rock3a-gmac1.patch

I need to try booting with the older devicetree image or that particular patch applied.

MichaIng commented on July 24, 2024

This patch looks suspiciously like the reverse of the devicetree diff I see:

But this patch is applied for 6.1/legacy builds only, not 6.6. There is no such patch for 6.6: https://github.com/armbian/build/tree/v24.2.1/patch/kernel/archive/rockchip64-6.6

Here is how to downgrade:

apt install linux-{image,dtb}-current-rockchip64=24.2.1

This is Linux 6.6.16, and likely matches the version/dtb from the Bullseye image you checked, since our Bullseye and Bookworm images share the same kernel. So we'd need to check for commits done between those versions. Hmm, I do not see any in mainline Linux: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/arch/arm64/boot/dts/rockchip/rk3568-rock-3a.dts?h=linux-6.6.y
Also not in the parent dts/i files. And also not on Armbian 🤔: https://github.com/armbian/build/commits/main/patch/kernel/archive/rockchip64-6.6

With which Bullseye image did you compare it?

mdio_bus stmmac-0: MDIO device at address 0 is missing.

So the diff in the mdio node seems suspicious, indeed. Mainline Linux matches the "+" side of your diff: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/boot/dts/rockchip/rk3568-rock-3a.dts?h=linux-6.6.y#n584
This is the current image/kernel, while "-" is the Bullseye one? The diff indeed matches the patch you linked. However, since this Linux 6.6.16 kernel has been live since February already, we did not have any other report, and this patch was never applied, it cannot be the relevant thing here (unless I am missing something).

ranma commented on July 24, 2024

With which Bullseye image did you compare it?

$ ls -l DietPi_ROCK3A-ARMv8-Bullseye.7z
-rw-r--r-- 1 ranma ranma 141095789 28. Dez 2022  DietPi_ROCK3A-ARMv8-Bullseye.7z

which contains Linux version 6.0.10-rk35xx (root@4896770df8c8) (aarch64-linux-gnu-gcc (GNU Toolchain for the A-profile Architecture 8.3-2019.03 (arm-rel-8.36)) 8.3.0, GNU ld (GNU Toolchain for the A-profile Architecture 8.3-2019.03 (arm-rel-8.36)) 2.32.0.20190321) #22.11.1 SMP PREEMPT Wed Nov 30 11:08:41 UTC 2022.

The system was in my parents' basement for a while, so this may actually have been my first attempt to upgrade to a newer version...

MichaIng commented on July 24, 2024

Ah okay, a very old one. Then the diff makes sense, regarding the patch. However, exactly that patch was suspected by the guy on the Armbian forum to cause the issue (with Linux 6.1) instead of solving it 😄. There were definitely some versions without the patch which had no Ethernet issues, at least not reported by anyone else. Did the downgrade to v24.2.1 work?

ranma commented on July 24, 2024

So with the current bullseye image I'm getting:

[    0.000000] Linux version 6.6.32-current-rockchip64 (armbian@next) (aarch64-linux-gnu-gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #1 SMP PREEMPT Sat May 25 14:22:56 UTC 2024
[...]
[    5.166389] rk_gmac-dwmac fe010000.ethernet: IRQ eth_lpi not found
[    5.172788] rk_gmac-dwmac fe010000.ethernet: clock input or output? (input).
[    5.173467] rk_gmac-dwmac fe010000.ethernet: Can not read property: tx_delay.
[    5.174113] rk_gmac-dwmac fe010000.ethernet: set tx_delay to 0x30
[    5.174714] rk_gmac-dwmac fe010000.ethernet: Can not read property: rx_delay.
[    5.175370] rk_gmac-dwmac fe010000.ethernet: set rx_delay to 0x10
[    5.175942] rk_gmac-dwmac fe010000.ethernet: integrated PHY? (no).
[    5.176566] rk_gmac-dwmac fe010000.ethernet: clock input from PHY
[    5.182147] rk_gmac-dwmac fe010000.ethernet: init for RGMII_ID
[    5.183228] rk_gmac-dwmac fe010000.ethernet: User ID: 0x30, Synopsys ID: 0x51
[    5.183913] rk_gmac-dwmac fe010000.ethernet:         DWMAC4/5
[    5.184389] rk_gmac-dwmac fe010000.ethernet: DMA HW capability register supported
[    5.185070] rk_gmac-dwmac fe010000.ethernet: RX Checksum Offload Engine supported
[    5.185749] rk_gmac-dwmac fe010000.ethernet: TX Checksum insertion supported
[    5.186386] rk_gmac-dwmac fe010000.ethernet: Wake-Up On Lan supported
[    5.190350] rk_gmac-dwmac fe010000.ethernet: TSO supported
[    5.190922] rk_gmac-dwmac fe010000.ethernet: Enable RX Mitigation via HW Watchdog Timer
[    5.191665] rk_gmac-dwmac fe010000.ethernet: Enabled RFS Flow TC (entries=10)
[    5.192320] rk_gmac-dwmac fe010000.ethernet: TSO feature enabled
[    5.192873] rk_gmac-dwmac fe010000.ethernet: Using 32/32 bits DMA host/device width

where with 6.6.33 and verbose logging I get:

[    0.000000] Linux version 6.6.33-current-rockchip64 (armbian@next) (aarch64-linux-gnu-gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #1 SMP PREEMPT Wed Jun 12 09:13:03 UTC 2024
[...]
[    5.362274] rk_gmac-dwmac fe010000.ethernet: IRQ eth_lpi not found
[    5.364069] rk_gmac-dwmac fe010000.ethernet: clock input or output? (input).
[    5.364744] rk_gmac-dwmac fe010000.ethernet: Can not read property: tx_delay.
[    5.365390] rk_gmac-dwmac fe010000.ethernet: set tx_delay to 0x30
[    5.365438] rk_gmac-dwmac fe010000.ethernet: Can not read property: rx_delay.
[    5.365472] rk_gmac-dwmac fe010000.ethernet: set rx_delay to 0x10
[    5.365515] rk_gmac-dwmac fe010000.ethernet: integrated PHY? (no).
[    5.366319] rk_gmac-dwmac fe010000.ethernet: clock input from PHY
[    5.375982] rk_gmac-dwmac fe010000.ethernet: init for RGMII_ID
[    5.381268] rk_gmac-dwmac fe010000.ethernet: User ID: 0x30, Synopsys ID: 0x51
[    5.381968] rk_gmac-dwmac fe010000.ethernet:         DWMAC4/5
[    5.382512] rk_gmac-dwmac fe010000.ethernet: DMA HW capability register supported
[    5.383211] rk_gmac-dwmac fe010000.ethernet: RX Checksum Offload Engine supported
[    5.383893] rk_gmac-dwmac fe010000.ethernet: TX Checksum insertion supported
[    5.384531] rk_gmac-dwmac fe010000.ethernet: Wake-Up On Lan supported
[    5.385272] rk_gmac-dwmac fe010000.ethernet: TSO supported
[    5.385792] rk_gmac-dwmac fe010000.ethernet: Enable RX Mitigation via HW Watchdog Timer
[    5.386556] rk_gmac-dwmac fe010000.ethernet: Enabled RFS Flow TC (entries=10)
[    5.387220] rk_gmac-dwmac fe010000.ethernet: TSO feature enabled
[    5.387773] rk_gmac-dwmac fe010000.ethernet: Using 32/32 bits DMA host/device width
[    5.389686] mdio_bus stmmac-0: MDIO device at address 0 is missing.

MichaIng commented on July 24, 2024

Our Bullseye and Bookworm use the same kernel, as I said, so they cannot behave any differently. What I meant is downgrading to an older Linux version provided by the Armbian APT repo. You can test this on Bookworm or Bullseye, doesn't matter:

apt install linux-{image,dtb}-current-rockchip64=24.2.1

But just to be sure, since I see no eth0: __stmmac_open: Cannot attach to PHY (error: -19) in above logs, eth0 still does not come up, and it is not listed as interface, is it?

ip link

EDIT: Ah, I misinterpreted the report at the Armbian forum. In that case, the Linux 6.1 version with the patch worked, and Linux 6.6.16 without the patch did not. So it is similar to your case (if Linux 6.6.16 after the above downgrade still has the same issue). However, it then seems to affect rare cases only, maybe a particular bad board revision or batch 🤔.

ranma commented on July 24, 2024

Did the downgrade to v24.2.1 work?

I just tried the downgrade, but still get the error:

root@DietPi:~# dpkg -l | grep [r]ockchip
ii  linux-dtb-current-rockchip64     24.2.1                         arm64        Armbian Linux current DTBs in /boot/dtb-6.6.16-current-rockchip64
ii  linux-image-current-rockchip64   24.2.1                         arm64        Armbian Linux current kernel image 6.6.16-current-rockchip64
rc  linux-image-edge-rockchip64      24.5.1                         arm64        Armbian Linux edge kernel image 6.8.11-edge-rockchip64
root@DietPi:~# reboot
[...]
[    0.000000] Linux version 6.6.16-current-rockchip64 (armbian@next) (aarch64-linux-gnu-gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #1 SMP PREEMPT Fri Feb 23 08:25:28 UTC 2024
[...]
[   11.326067] rk_gmac-dwmac fe010000.ethernet eth0: __stmmac_open: Cannot attach to PHY (error: -19)

Our Bullseye and Bookworm use the same kernel, as I said, so they cannot behave any differently.

For the bullseye image I just downloaded I see a working 6.6.32-current-rockchip64 kernel though, while the one downgraded to 24.2.1 is 6.6.16-current-rockchip64, so it's not the same.

This was DietPi_ROCK3A-ARMv8-Bullseye.img.xz 2024-05-13 03:30 180M,

$ sha1sum DietPi_ROCK3A-ARMv8-Bullseye.img.xz 
421e87a41e21d712c4f70afd114992656f637c80  DietPi_ROCK3A-ARMv8-Bullseye.img.xz

though this was booted from sdcard as a fresh image.

With the eMMC in its current half-upgraded state, it says:

Armbian 23.8.1 bullseye ttyS2 
[...]
 DietPi v9.5.1 : 22:18 - Fri 06/14/24

MichaIng commented on July 24, 2024

For the bullseye image I just downloaded I see a working 6.6.32-current-rockchip64 kernel though

You mean that one has no Ethernet issue? This is the exact same kernel used on the Bookworm image.

while the one downgraded to 24.2.1 is 6.6.16-current-rockchip64, so it's not the same.

Our current images (both Bookworm and Bullseye) are shipped with 6.6.32-current-rockchip64, since this is the most recent one pushed to the Armbian APT repo. Our previous images (both) shipped with 6.6.16-current-rockchip64. And before you upgraded your system from DietPi v9.4 to v9.5, you should have had 6.6.16-current-rockchip64 as well, which has been available since February.

Hence I am confused that downgrading to the exact same kernel version that worked fine for you already (and worked for months for everyone else as well) does not solve the issue.

Just to rule it out, does this have any effect?

sed -i '/^[^#].*network-pre.target/s/^/#/' /etc/systemd/system/ifupdown-pre.service.d/dietpi.conf
systemctl daemon-reload
reboot

Maybe it is some strange timing issue.
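
For anyone unsure what that sed call changes: it prefixes a "#" to every line that mentions network-pre.target and is not already commented out. A small sketch that applies the same expression to stdin instead of editing the file in place (the function name and the sample drop-in content are hypothetical; the real dietpi.conf may look different):

```shell
#!/bin/sh
# Same sed expression as above, but reading stdin and writing stdout
# instead of editing the drop-in in place.
# comment_network_pre is a hypothetical helper name.
comment_network_pre() {
    sed '/^[^#].*network-pre.target/s/^/#/'
}

# Hypothetical drop-in content, for illustration only.
printf '%s\n' '[Unit]' 'Before=network-pre.target' '#Wants=network-pre.target' |
    comment_network_pre
```

With this sample input, only the Before= line gains a leading "#"; the [Unit] header and the already commented line pass through unchanged.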

ranma commented on July 24, 2024

I think it's actually u-boot that is different. The bad boots were all from eMMC (which I was trying to repair) and the good boots are fresh installs from microSD.

The full boot log diff shows:

--- bad_boot_no_net.coldboot.log        2024-06-15 08:45:10.266151576 +0200
+++ good_coldboot_from_microsd.log      2024-06-15 08:52:11.706786999 +0200
[...]
@@ -95,25 +93,26 @@
 INFO:    SPSR = 0x3c9
 
 
-U-Boot 2017.09-armbian (Nov 30 2022 - 10:41:48 +0000)
+U-Boot 2017.09-armbian (May 20 2024 - 00:41:52 +0000)
 
 Model: Radxa ROCK3 Model A
 PreSerial: 2, raw, 0xfe660000
 DRAM:  2 GiB
 Sysmem: init
[...]
@@ -133,50 +132,42 @@
   dpll 780000 KHz
   gpll 1188000 KHz
   cpll 1000000 KHz
-  npll 24000 KHz
+  npll 1200000 KHz
   vpll 24000 KHz
   hpll 24000 KHz
   ppll 200000 KHz
   armclk 816000 KHz
   aclk_bus 150000 KHz
-  pclk_bus 50000 KHz
+  pclk_bus 100000 KHz
   aclk_top_high 300000 KHz
   aclk_top_low 200000 KHz
   hclk_top 150000 KHz
-  pclk_top 50000 KHz
+  pclk_top 100000 KHz
   aclk_perimid 300000 KHz
   hclk_perimid 150000 KHz
   pclk_pmu 100000 KHz
 No misc partition
-Net:   No ethernet found.
+Net:   eth1: ethernet@fe010000
 Hit key to stop autoboot('CTRL+C'):  0 
[...]

Full boot logs from serial port:
bad_boot_no_net.coldboot.log
bad_boot_no_net.warmboot_from_working_system.log
good_coldboot_from_microsd.log
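
The deciding line in these logs is the "Net:" line that U-Boot prints before autoboot. A minimal sketch for pulling it out of saved serial logs, assuming the logs are plain text captures like the ones above (the function name uboot_net_status is made up for illustration):

```shell
#!/bin/sh
# Sketch: print what U-Boot reported on its "Net:" line(s) in a serial log.
# Carriage returns are stripped first, since serial captures often use CRLF.
# uboot_net_status is a hypothetical helper name.
uboot_net_status() {
    tr -d '\r' < "$1" | sed -n 's/^Net:[[:space:]]*//p'
}

# When called with log files as arguments, print one result per file.
if [ "$#" -gt 0 ]; then
    for log in "$@"; do
        printf '%s: %s\n' "$log" "$(uboot_net_status "$log")"
    done
fi
```

Run against the bad and good logs, this should make the "No ethernet found." versus "eth1: ethernet@fe010000" difference visible without diffing the full boot output.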

ranma commented on July 24, 2024

Copying the newer u-boot from microsd to emmc:

# fdisk -l /dev/mmcblk0
Disk /dev/mmcblk0: 14.56 GiB, 15634268160 bytes, 30535680 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 7A7207A9-21C7-462D-A919-D0FFE923F28D

Device          Start      End  Sectors  Size Type
/dev/mmcblk0p1  32768   294911   262144  128M Microsoft basic data
/dev/mmcblk0p2 294912 30535646 30240735 14.4G Linux filesystem
# dd if=/dev/mmcblk1 of=/dev/mmcblk0 seek=64 skip=64 count=32704
32704+0 records in
32704+0 records out
16744448 bytes (17 MB, 16 MiB) copied, 1.2801 s, 13.1 MB/s

And indeed after a reboot the network is back to working now :)
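
For reference, the dd numbers follow from the partition table: dd uses 512-byte blocks by default, the copy starts at sector 64 (the skip/seek value), and it stops right before the first partition at sector 32768. A small sketch of that arithmetic (variable and function names are only illustrative):

```shell
#!/bin/sh
# Sketch: derive the dd count for the area between sector 64 and the
# first partition. Values are taken from the fdisk and dd output above.
sector_size=512         # bytes per sector, per fdisk
first_part_start=32768  # start sector of /dev/mmcblk0p1
loader_start=64         # skip/seek value used in the dd command

# Number of sectors between loader_start and the first partition.
bootloader_sectors() {
    echo $(( first_part_start - loader_start ))
}

# Same span expressed in bytes.
bootloader_bytes() {
    echo $(( $(bootloader_sectors) * sector_size ))
}

echo "count=$(bootloader_sectors) bytes=$(bootloader_bytes)"
```

This prints count=32704 bytes=16744448, matching the dd summary, so the copy covers only the gap in front of the first partition and leaves the partitions themselves alone.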

ranma commented on July 24, 2024

I kept a copy of the old u-boot in case you want to try to reproduce using that :)
diskhdr.zip

$ strings diskhdr.img  | grep armbian
U-Boot SPL 2017.09-armbian (Nov 30 2022 - 10:41:48)
U-Boot 2017.09-armbian
U-Boot 2017.09-armbian (Nov 30 2022 - 10:41:48 +0000)
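
If strings is not installed, a similar check can be sketched with grep alone. This assumes GNU grep (for -a and -o), and the function name uboot_banners is hypothetical; it only lists banners that include a build date in parentheses:

```shell
#!/bin/sh
# Sketch: list U-Boot build banners embedded in a raw image file.
# -a treats the binary as text, -o prints only the matching part.
# uboot_banners is a hypothetical helper name.
uboot_banners() {
    LC_ALL=C grep -a -o 'U-Boot [A-Za-z0-9. -]* ([^)]*)' "$1" | sort -u
}

# When called with an image file as argument, print its banners.
if [ "$#" -gt 0 ]; then
    uboot_banners "$1"
fi
```

It can be pointed at a dump such as diskhdr.img. To check a device, it is probably better to dump the first sectors to a file with dd first, rather than letting grep read the whole disk.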

MichaIng commented on July 24, 2024

Great debugging. It shows that the adapter is already missing at the U-Boot stage, which indicates a U-Boot issue, not a kernel issue. Still strange, that this issue appeared for you after a kernel upgrade.

The U-Boot package is upgraded via APT, but it is not flashed automatically. You can flash it via the dietpi-config advanced options. That flashes it to the current root drive, so you cannot use it to flash the eMMC while booted from the SD card. For that case, there is the /usr/lib/u-boot/platform_install.sh script.

ranma commented on July 24, 2024

Still strange, that this issue appeared for you after a kernel upgrade.

I'm assuming it might be something like the older kernel had a workaround for the older u-boot and something got fixed or changed in the newer u-boot so that workaround was dropped in newer kernels.

MichaIng commented on July 24, 2024

Whatever it was, good that you cannot replicate it anymore with the current U-Boot and kernel. Hence I'll mark this as closed. I'll keep it in mind in case others run into the same issue, in which case we can apply a patch to enforce the U-Boot upgrade.

josacar commented on July 24, 2024

Thanks for this. I ran into this issue on Armbian and solved it by patching U-Boot, following this issue.
