marin-m / vmlinux-to-elf

A tool to recover a fully analyzable .ELF from a raw kernel, through extracting the kernel symbol table (kallsyms)

License: GNU General Public License v3.0

Python 100.00%
reverse-engineering linux-kernel linux vmlinux elf firmware-analysis

vmlinux-to-elf's People

Contributors

apoch-zxv, bmax121, clubby789, cr4sh, lucasg, marin-m, myldero, screwer, sgfvamll

vmlinux-to-elf's Issues

Option for increasing rom size

By default, IDA suggests a ROM size equal to the loading size, which is not ideal. It would be better to increase the ROM size so that virtual addresses are handled properly: when the ROM size equals the loading size, they are shown in red. Increasing the ROM size gets rid of that.

The function offsets reported by kallsyms_finder.py are wrong

Running kallsyms_finder.py on the boot.img from this mi11 ROM:

$ python3 kallsyms_finder.py ../../../xiaomi/mi11/boot.img
[+] Kernel successfully decompressed in-memory (the offsets that follow will be given relative to the decompressed binary)
[+] Version string: Linux version 5.4.61-qgki-g0816492c5df1 ([email protected]) (Android (6443078 based on r383902) clang version 11.0.1 (https://android.googlesource.com/toolchain/llvm-project b397f81060ce6d701042b782172ed13bee898b79), LLD 11.0.1 (/buildbot/tmp/tmp6_m7QH b397f81060ce6d701042b782172ed13bee898b79)) #1 SMP PREEMPT Wed May 26 03:01:02 CST 2021
[+] Guessed architecture: aarch64 successfully in 0.00 seconds
[+] Found relocations table at file offset 0x2853a68 (count=119634)
[+] Found kernel text candidate: 0xffffffc010000000
WARNING! bad rela offset ffffffc012dd6850
[+] Found kallsyms_token_table at file offset 0x01fadb68
[+] Found kallsyms_token_index at file offset 0x01fade90
[+] Found kallsyms_markers at file offset 0x01fad0a0
[+] Found kallsyms_names at file offset 0x01bfcc18
[+] Found kallsyms_num_syms at file offset 0x01bfcc10
[i] Negative offsets overall: 0 %
[i] Null addresses overall: 0 %
[+] Found kallsyms_offsets at file offset 0x01b50a00

The reported offsets differ from the correct ones by xxx.

The architecture of your kernel could not be guessed successfully.

$ vmlinux-to-elf vmlinuz-3.13.11-1-amd64-vyos vyos.elf
[+] Kernel successfully decompressed in-memory (the offsets that follow will be given relative to the decompressed binary)
[+] Version string: Linux version 3.13.11-1-amd64-vyos (jenkins@squeeze64devel) (gcc version 4.4.5 (Debian 4.4.5-8) ) #1 SMP Wed Aug 12 02:08:05 UTC 2015
[!] The architecture of your kernel could not be guessed successfully. Please specify the --e-machine and --bit-size arguments manually (use --help for their precise specification).

I'm trying to analyse the VyOS kernel in IDA Pro, but I get the above error, and I would like to know why. Is it because vmlinux-to-elf cannot identify this weird Linux version? Here is the output of file for "vmlinuz-3.13.11-1-amd64-vyos":

$ file vmlinuz-3.13.11-1-amd64-vyos
vmlinuz-3.13.11-1-amd64-vyos: Linux kernel x86 boot executable bzImage, version 3.13.11-1-amd64-vyos (jenkins@squeeze64devel) #1 SMP Wed Aug 12 02:08:05 UTC 2015, RO-rootFS, swap_dev 0x3, Normal VGA

raw elf format for embedded devices

Hello,

I tried to run vmlinux-to-elf for a raw elf format that I obtained from a Xilinx Zynq bootloader, but it can't detect the format.

Can you please take a look at it? I'm trying to analyze it with IDA, but I can't get the image right.

bootfiles.zip

Inside there are 3 files:
boot.bin: a Xilinx bootloader in their own format, generated with their "bootgen" utility.
Here is the PDF from their website explaining the format.
Based on the file, I made a tool to extract the partitions from it and got 2 .elf files.
The fsbl_nand.elf file is the first-stage bootloader, and the other ELF file is the secondary bootloader, which handles firmware updates and safe mode and launches the main application.

Here is the tool I made to analyze the file and extract the ELF images. It's in Java (java -jar XilinxBIFTool.jar boot.bin):
XilinxBIFTool.zip

If I run binwalk on the bigger ELF file I get some gzip sections, but I don't know if that's right.

I also tried arm-none-eabi-objcopy --input-target=binary --output-target=elf32-littlearm fsbl_nand.elf test.elf
and tried to verify the result with
arm-none-eabi-readelf -a test.elf
It does give some information, but the result is still not a valid ELF file. I think objcopy can't verify its input.

Do you know any other way to build the final ELF file?
Any guidance is greatly appreciated!

armbe kernel processing failure

Source: https://downloads.openwrt.org/releases/17.01.0/targets/ixp4xx/harddisk/lede-17.01.0-r3205-59508e3-ixp4xx-harddisk-ap1000-zImage

[+] Kernel successfully decompressed in-memory (the offsets that follow will be given relative to the decompressed binary)
[+] Version string: Linux version 4.4.50 ([email protected]) (gcc version 5.4.0 (LEDE GCC 5.4.0 r3103-1b51a49) ) #0 Mon Feb 20 17:13:44 2017
[+] Guessed architecture: armbe successfully in 0.81 seconds
Traceback (most recent call last):
  File "C:\Work\git\vmlinux-to-elf\vmlinux_to_elf\kallsyms_finder.py", line 200, in __init__
    self.find_kallsyms_names_uncompressed()
  File "C:\Work\git\vmlinux-to-elf\vmlinux_to_elf\kallsyms_finder.py", line 540, in find_kallsyms_names_uncompressed
    raise KallsymsNotFoundException('No embedded symbol table found in this kernel')
kallsyms_finder.KallsymsNotFoundException: No embedded symbol table found in this kernel

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Work\git\vmlinux-to-elf\vmlinux_to_elf\main.py", line 67, in <module>
    args.base_address, args.file_offset
  File "C:\Work\git\vmlinux-to-elf\vmlinux_to_elf\elf_symbolizer.py", line 44, in __init__
    kallsyms_finder = KallsymsFinder(file_contents, bit_size)
  File "C:\Work\git\vmlinux-to-elf\vmlinux_to_elf\kallsyms_finder.py", line 205, in __init__
    raise first_error
  File "C:\Work\git\vmlinux-to-elf\vmlinux_to_elf\kallsyms_finder.py", line 193, in __init__
    self.find_kallsyms_token_table()
  File "C:\Work\git\vmlinux-to-elf\vmlinux_to_elf\kallsyms_finder.py", line 423, in find_kallsyms_token_table
    raise KallsymsNotFoundException('%d candidates for kallsyms_token_table in kernel image' % len(candidates_offsets))
kallsyms_finder.KallsymsNotFoundException: 0 candidates for kallsyms_token_table in kernel image

BTW, can you add a "fallback" mode that writes the decompressed image (with or without ELF headers) to disk if the kallsyms finder fails?
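To illustrate the kind of fallback requested above, here is a minimal sketch that dumps the first gzip payload found in an image to disk. It assumes a gzip-compressed kernel and uses only the standard library; `dump_first_gzip_payload` is a hypothetical helper, not part of vmlinux-to-elf, and other compression formats would need their own handling.

```python
import zlib

GZIP_MAGIC = b"\x1f\x8b\x08"  # gzip signature followed by the "deflate" method byte


def dump_first_gzip_payload(blob: bytes, out_path: str) -> int:
    """Write the first gzip stream in blob that decompresses to out_path.

    Returns the number of bytes written, or raises ValueError if no
    candidate offset yields any decompressed data.
    """
    pos = blob.find(GZIP_MAGIC)
    while pos != -1:
        # wbits=31 selects the gzip container; bytes after the end of the
        # stream are left in unused_data instead of causing an error.
        decompressor = zlib.decompressobj(wbits=31)
        try:
            data = decompressor.decompress(blob[pos:])
        except zlib.error:
            data = b""  # false positive on the magic bytes, keep scanning
        if data:
            with open(out_path, "wb") as out:
                out.write(data)
            return len(data)
        pos = blob.find(GZIP_MAGIC, pos + 1)
    raise ValueError("no decompressible gzip stream found")
```

The raw dump has no ELF headers, so a disassembler would still need the load address and architecture supplied by hand.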

ValueError: 2 candidates for kallsyms_token_table in kernel image

Hello, I encountered a problem when using this tool, the output is as follows:

$ vmlinux-to-elf kernel.uimage kernel.elf
[+] Kernel successfully decompressed in-memory (the offsets that follow will be given relative to the decompressed binary)
[+] Version string: Linux version 2.6.32.61-EMBSYS-CGEL-4.03.20.P1.F0 (root@A000001) (gcc version 4.5.2 2015-05-12 ZTE Embsys-TSP Vn (GCC) ) #102 Thu Jan 25 05:19:23 CST 2018
[+] Guessed architecture: armle successfully in 3.70 seconds
Traceback (most recent call last):
  File "/usr/local/bin/vmlinux-to-elf", line 67, in <module>
    args.base_address, args.file_offset
  File "/usr/local/bin/vmlinux_to_elf/elf_symbolizer.py", line 44, in __init__
    kallsyms_finder = KallsymsFinder(file_contents, bit_size)
  File "/usr/local/bin/vmlinux_to_elf/kallsyms_finder.py", line 193, in __init__
    self.find_kallsyms_token_table()
  File "/usr/local/bin/vmlinux_to_elf/kallsyms_finder.py", line 411, in find_kallsyms_token_table
    raise ValueError('%d candidates for kallsyms_token_table in kernel image' % len(candidates_offsets))
ValueError: 2 candidates for kallsyms_token_table in kernel image

Could not find kallsyms_names

find_kallsyms_num_syms() loops forever with boot.img in this Pixel factory image.

~/bin/vmlinux-to-elf/kallsyms-finder ./boot.img
[+] Kernel successfully decompressed in-memory (the offsets that follow will be given relative to the decompressed binary)
[+] Kernel successfully decompressed in-memory (the offsets that follow will be given relative to the decompressed binary)
[+] Version string: Linux version 4.14.170-g5513138224ab-ab6570431 (android-build@abfarm-east4-052) (Android (5484270 based on r353983c) clang version 9.0.3 (https://android.googlesource.com/toolchain/clang 745b335211bb9eadfa6aa6301f84715cee4b37c5) (https://android.googlesource.com/toolchain/llvm 60cf23e54e46c807513f7a36d0a7b777920b5881) (based on LLVM 9.0.3svn)) #1 SMP PREEMPT Tue Jun 9 02:18:01 UTC 2020
[+] Guessed architecture: aarch64 successfully in 4.52 seconds
[+] Found relocations table at file offset 0x2577ec0 (count=154559)
[+] Found kernel text candidate: 0xffffff8008300000
[+] Successfully applied 154559 relocations.
[+] Found kallsyms_token_table at file offset 0x0210c000
[+] Found kallsyms_token_index at file offset 0x0210c400
[+] Found kallsyms_markers at file offset 0x0210ab00

needle is still -1 even after the loop has run 5 million times.
Where should I start debugging this?

ValueError: 3 candidates for kallsyms_token_table in kernel image

Hello, I found that a newer kernel image triggers the same error as #21:

$ vmlinux-to-elf kernel.bin kernel.elf
[+] Kernel successfully decompressed in-memory (the offsets that follow will be given relative to the decompressed binary)
[+] Version string: Linux version 2.6.32.61-EMBSYS-CGEL-4.03.20.P1.F0 (wuxianglong@zxric-nb) (gcc version 4.1.2 2011-06-24 ZTE Embsys-TSP V2.08.20_P2) #108 Fri Apr 26 20:57:18 CST 2019
[+] Guessed architecture: armle successfully in 3.96 seconds
Traceback (most recent call last):
  File "/usr/local/bin/vmlinux-to-elf", line 67, in <module>
    args.base_address, args.file_offset
  File "/usr/local/bin/vmlinux_to_elf/elf_symbolizer.py", line 44, in __init__
    kallsyms_finder = KallsymsFinder(file_contents, bit_size)
  File "/usr/local/bin/vmlinux_to_elf/kallsyms_finder.py", line 193, in __init__
    self.find_kallsyms_token_table()
  File "/usr/local/bin/vmlinux_to_elf/kallsyms_finder.py", line 417, in find_kallsyms_token_table
    raise ValueError('%d candidates for kallsyms_token_table in kernel image' % len(candidates_offsets))
ValueError: 3 candidates for kallsyms_token_table in kernel image

Regards.

kallsyms_finder.KallsymsNotFoundException: No embedded symbol table found in this kernel

An error is reported when extracting an ARM kernel:

(base) ➜  ex /home/tower/tools/vmlinux-to-elf/vmlinux-to-elf --e-machine 9 --bit-size 32 ./uImage ./vmlinux_elf
[+] Kernel successfully decompressed in-memory (the offsets that follow will be given relative to the decompressed binary)
[+] Version string: Linux version 4.9.84 (jenkins@Vcilnx2) (gcc version 4.9.4 (Buildroot 2017.08-gc7bbae9-dirty) ) #1 PREEMPT Thu Mar 30 17:38:36 CST 2023
Traceback (most recent call last):
  File "/home/tower/tools/vmlinux-to-elf/vmlinux_to_elf/kallsyms_finder.py", line 201, in __init__
    self.find_kallsyms_names_uncompressed()
  File "/home/tower/tools/vmlinux-to-elf/vmlinux_to_elf/kallsyms_finder.py", line 544, in find_kallsyms_names_uncompressed
    raise KallsymsNotFoundException('No embedded symbol table found in this kernel')
kallsyms_finder.KallsymsNotFoundException: No embedded symbol table found in this kernel

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/tower/tools/vmlinux-to-elf/vmlinux-to-elf", line 63, in <module>
    ElfSymbolizer(
  File "/home/tower/tools/vmlinux-to-elf/vmlinux_to_elf/elf_symbolizer.py", line 44, in __init__
    kallsyms_finder = KallsymsFinder(file_contents, bit_size)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tower/tools/vmlinux-to-elf/vmlinux_to_elf/kallsyms_finder.py", line 206, in __init__
    raise first_error
  File "/home/tower/tools/vmlinux-to-elf/vmlinux_to_elf/kallsyms_finder.py", line 194, in __init__
    self.find_kallsyms_token_table()
  File "/home/tower/tools/vmlinux-to-elf/vmlinux_to_elf/kallsyms_finder.py", line 425, in find_kallsyms_token_table
    raise KallsymsNotFoundException('%d candidates for kallsyms_token_table in kernel image' % len(candidates_offsets))
kallsyms_finder.KallsymsNotFoundException: 0 candidates for kallsyms_token_table in kernel image
(base) ➜  ex file ./uImage
./uImage: u-boot legacy uImage, Linux-4.9.84, Linux/ARM, OS Kernel Image (Not compressed), 1429376 bytes, Thu Mar 30 09:38:42 2023, Load Address: 0x20008000, Entry Point: 0x20008000, Header CRC: 0xE5DD206F, Data CRC: 0x0C8C1F3C
(base) ➜  ex binwalk uImage

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             uImage header, header size: 64 bytes, header CRC: 0xE5DD206F, created: 2023-03-30 09:38:42, image size: 1429376 bytes, Data Address: 0x20008000, Entry Point: 0x20008000, data CRC: 0xC8C1F3C, OS: Linux, CPU: ARM, image type: OS Kernel Image, compression type: none, image name: "Linux-4.9.84"
14947         0x3A63          xz compressed data
15281         0x3BB1          xz compressed data

(base) ➜  ex

Could not guess the architecture register size for kernel

I encountered the following error with commit f86fff0 while trying to extract kallsyms from the vmlinux of https://2022.ctf.link/internal/challenge/0a8dd8e2-de2f-4eff-97d7-643c6f28dc22.

# kallsyms-finder ./vmlinux 
[+] Version string: Linux version 6.2.2 (sisu@sisu-ThinkPad-E14-Gen-2) (gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0, GNU ld (GNU Binutils for Ubuntu) 2.38) # SMP PREEMPT_DYNAMIC 
[+] Guessed architecture: x86_64 successfully in 3.74 seconds
[+] Found kallsyms_token_table at file offset 0x018e75b0
[+] Found kallsyms_token_index at file offset 0x018e7950
Traceback (most recent call last):
  File "/usr/local/bin/kallsyms-finder", line 1145, in <module>
    kallsyms = KallsymsFinder(obtain_raw_kernel_from_file(kernel_bin.read()), args.bit_size)
  File "/usr/local/bin/kallsyms-finder", line 208, in __init__
    self.find_kallsyms_markers()
  File "/usr/local/bin/kallsyms-finder", line 720, in find_kallsyms_markers
    raise ValueError('Could not guess the architecture register ' +
ValueError: Could not guess the architecture register size for kernel

(I extracted vmlinux from the bzImage with https://raw.githubusercontent.com/torvalds/linux/v5.15/scripts/extract-vmlinux. Make sure that lzop and zstd are also installed.)

I cannot figure out how to solve this.
Simply forcing max_number_of_space_between_two_nulls to one of 2, 4 or 8 leads to another error: Could not find kallsyms_markers

Support uImage format and/or manual arch specification

First, congrats on the awesome tool.

I decided to try it out and went to the OpenWRT release archive. The first one alphabetically was ARC and it failed:

  File "C:\Work\git\vmlinux-to-elf\vmlinux_to_elf\architecture_detecter.py", line 157, in guess_architecture
    raise ValueError('The architecture could not be guessed successfully')
ValueError: The architecture could not be guessed successfully

In fact, the uImage header already includes the architecture, load address and even entrypoint:

openwrt-18.06.4-arc770-generic-uImage: u-boot legacy uImage, ARC OpenWrt Linux-4.9.184, Linux/DesignWare ARC, OS Kernel Image (Not compressed), 4522192 bytes, Thu Jun 27 12:18:52 2019, Load Address: 0x80000000, Entry Point: 0x8000A000, Header CRC: 0xA11EF4A4, Data CRC: 0xAC4BE39B

Additionally, there is no need to know the architecture when not writing out the ELF file (e.g. when just dumping symbols), so this step could be skipped until required. You could also let the user specify it manually or just write 0 to e_machine.

Note: the uImage format may employ its own compression (I have seen at least gzip used).
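As a rough illustration of the point about the uImage header, the sketch below reads the 64-byte legacy U-Boot header with `struct` and reports the architecture, compression, load address and entry point. The field layout and magic follow U-Boot's legacy image format; the name tables are deliberately partial, and `parse_uimage_header` and `UImageInfo` are hypothetical names, not functions of vmlinux-to-elf.

```python
import struct
import zlib
from typing import NamedTuple

UIMAGE_MAGIC = 0x27051956
# magic, hcrc, time, size, load, entry, dcrc, os, arch, type, comp, name[32]
UIMAGE_HEADER = struct.Struct(">7I4B32s")  # 64 bytes, big-endian

# Partial subsets of U-Boot's IH_ARCH_* and IH_COMP_* codes, for illustration
ARCH_NAMES = {2: "arm", 3: "i386", 5: "mips", 7: "powerpc", 22: "arm64", 23: "arc"}
COMP_NAMES = {0: "none", 1: "gzip", 2: "bzip2", 3: "lzma", 4: "lzo", 5: "lz4"}


class UImageInfo(NamedTuple):
    arch: str
    compression: str
    load_address: int
    entry_point: int
    data_size: int
    name: str


def parse_uimage_header(blob: bytes) -> UImageInfo:
    if len(blob) < UIMAGE_HEADER.size:
        raise ValueError("too short for a uImage header")
    (magic, hcrc, _time, size, load, entry, _dcrc,
     _os, arch, _type, comp, name) = UIMAGE_HEADER.unpack_from(blob)
    if magic != UIMAGE_MAGIC:
        raise ValueError("not a legacy uImage")
    # The header CRC is computed over the header with the hcrc field zeroed.
    zeroed = blob[:4] + b"\0" * 4 + blob[8:UIMAGE_HEADER.size]
    if zlib.crc32(zeroed) != hcrc:
        raise ValueError("uImage header CRC mismatch")
    return UImageInfo(
        arch=ARCH_NAMES.get(arch, f"unknown({arch})"),
        compression=COMP_NAMES.get(comp, f"unknown({comp})"),
        load_address=load,
        entry_point=entry,
        data_size=size,
        name=name.split(b"\0", 1)[0].decode("ascii", "replace"),
    )
```

The kernel payload starts right after the 64 header bytes, so a caller could strip the header and decompress the payload according to the reported compression before looking for kallsyms.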

Problem with Android 10: 'struct.error: argument out of range'

Hello, there is unfortunately an issue with some Android 10 kernels:

[+] Version string: Linux version 4.14.117-perf+ (OnePlus@rd-build-91) (clang version 8.0.8 for Android NDK) #1 SMP PREEMPT Mon Sep 7 21:07:28 CST 2020
[+] Guessed architecture: aarch64 successfully in 13.89 seconds
[+] Found relocations table at file offset 0x1c63128 (count=107056)
[+] Found kernel text candidate: 0xffffff8008080000
Traceback (most recent call last):
  File "/home/itz63c/dumpyara/vmlinux-to-elf/vmlinux_to_elf/kallsyms_finder.py", line 1102, in <module>
    kallsyms = KallsymsFinder(obtain_raw_kernel_from_file(kernel_bin.read()), args.bit_size)
  File "/home/itz63c/dumpyara/vmlinux-to-elf/vmlinux_to_elf/kallsyms_finder.py", line 188, in __init__
    self.apply_elf64_rela()
  File "/home/itz63c/dumpyara/vmlinux-to-elf/vmlinux_to_elf/kallsyms_finder.py", line 356, in apply_elf64_rela
    img[offset:offset+8] = pack('<Q', value)
struct.error: argument out of range
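For context on the exception itself: `struct.pack('<Q', value)` only accepts integers in the range 0 to 2**64 - 1, so a value that went negative or overflowed during arithmetic is rejected. The sketch below shows one way to wrap a value modulo 2**64, as C pointer arithmetic would. Whether wrapping is the right fix for this particular kernel is an assumption that the report does not settle, and `pack_u64_wrapping` is a hypothetical helper.

```python
import struct

MASK64 = (1 << 64) - 1


def pack_u64_wrapping(value: int) -> bytes:
    """Pack value as a little-endian u64, wrapping modulo 2**64."""
    return struct.pack("<Q", value & MASK64)


def fits_u64(value: int) -> bool:
    """Report whether struct's 'Q' format would accept value unchanged."""
    return 0 <= value <= MASK64
```

Logging the offending offset whenever `fits_u64` returns False would also help tell a genuinely bad relocation entry apart from a simple sign issue.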

Kernel download: boot.img

4.9.65 vmlinux failed to process

I am trying to decompile the vmlinux from an Android device's stock ROM, and I wanted to use your script first to improve the decompilation quality. When I try to process the vmlinux, I get an "'ElfNullSection' object has no attribute 'symbol_table'" exception.

After adding some debugging code, I see that self.symtab_section (which is set to self.elf_file.sections[self.section_header.sh_link]) is an ElfNullSection instead of an ElfSection. That's all I found. I'm not good at Python, so I can't fix it myself...

$ vmlinux-to-elf vmlinuxEng vmlinuxEng.elf
[+] Version string: Linux version 4.9.65+ (flyme@Mz-Builder-L23) (gcc version 4.9.x 20150123 (prerelease) (GCC) ) #1 SMP PREEMPT Wed Jul 25 17:45:44 CST 2018

[+] Guessed architecture: aarch64 successfully in 41.76 seconds
[+] Found relocations table at file offset 0x1dc38f0 (count=233684)
[+] Found kernel text candidate: 0xffffff8008080000
[+] Successfully applied 233684 relocations.
[+] Found kallsyms_token_table at file offset 0x01869e00
[+] Found kallsyms_token_index at file offset 0x0186a300
[+] Found kallsyms_markers at file offset 0x01868a00
[+] Found kallsyms_names at file offset 0x01699100
[+] Found kallsyms_num_syms at file offset 0x01699000
[i] Negative offsets overall: 0 %
[i] Null addresses overall: 0 %
[+] Found kallsyms_offsets at file offset 0x015ffffc
Traceback (most recent call last):
  File "/usr/bin/vmlinux-to-elf", line 63, in <module>
    ElfSymbolizer(
  File "/usr/lib/python3.10/site-packages/vmlinux_to_elf/elf_symbolizer.py", line 49, in __init__
    kernel = ElfFile.from_bytes(BytesIO(file_contents))
  File "/usr/lib/python3.10/site-packages/vmlinux_to_elf/utils/elf.py", line 166, in from_bytes
    obj.unserialize(data)
  File "/usr/lib/python3.10/site-packages/vmlinux_to_elf/utils/elf.py", line 186, in unserialize
    section.post_unserialize()
  File "/usr/lib/python3.10/site-packages/vmlinux_to_elf/utils/elf.py", line 928, in post_unserialize
    relocation.associated_symbol = self.symtab_section.symbol_table[relocation.r_info_sym]
AttributeError: 'ElfNullSection' object has no attribute 'symbol_table'

Feature: Arch as command line argument

I think it would be nice to have an --arch argument for kallsyms-finder and vmlinux-to-elf that skips the architecture guessing entirely. Usually we already know the arch beforehand, so just supplying it would speed up extraction quite a bit, since this step can (at least on my machine) take around 15 seconds at times.
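A minimal sketch of what such an option could look like, assuming it simply maps a friendly name onto the values that the existing --e-machine and --bit-size arguments expect. The `--arch` flag, `ARCH_TABLE` and both functions are hypothetical; the e_machine numbers are the standard ELF constants.

```python
import argparse
from typing import Tuple

# name -> (ELF e_machine constant, register size in bits)
ARCH_TABLE = {
    "x86": (3, 32),        # EM_386
    "x86_64": (62, 64),    # EM_X86_64
    "arm": (40, 32),       # EM_ARM
    "aarch64": (183, 64),  # EM_AARCH64
    "mips": (8, 32),       # EM_MIPS
    "ppc": (20, 32),       # EM_PPC
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="hypothetical --arch front-end")
    parser.add_argument("--arch", choices=sorted(ARCH_TABLE),
                        help="skip architecture guessing and use this architecture")
    return parser


def resolve_arch(name: str) -> Tuple[int, int]:
    """Return the (e_machine, bit_size) pair for a known architecture name."""
    return ARCH_TABLE[name]
```

Endianness is left out of this table; a fuller version would need it for cases such as big-endian ARM or MIPS.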

Problem with Android 10/11 Xiaomi device

I am facing an assertion error with the Xiaomi K20 Pro.

I printed a few values, and they were:

kallsyms_markers_entries = 8034917443174400
self.offset_table_element_size = 8
position = -8034917414015472

Link to kernel https://git.rip/dumps/xiaomi/raphael/-/blob/raphael-user-11-RKQ1.200826.002-20.12.28-release-keys/bootimg/kernel

  File "./kallsyms-finder", line 1116, in <module>
    kallsyms = KallsymsFinder(obtain_raw_kernel_from_file(kernel_bin.read()), args.bit_size)
  File "./kallsyms-finder", line 209, in __init__
    self.find_kallsyms_names()
  File "./kallsyms-finder", line 775, in find_kallsyms_names
    assert position > 0
AssertionError

Android 5.4 kernel error

Traceback (most recent call last):
  File "./main.py", line 67, in <module>
    args.base_address, args.file_offset
  File "/Users/weishu/dev/github/vmlinux-to-elf/vmlinux_to_elf/elf_symbolizer.py", line 44, in __init__
    kallsyms_finder = KallsymsFinder(file_contents, bit_size)
  File "/Users/weishu/dev/github/vmlinux-to-elf/vmlinux_to_elf/kallsyms_finder.py", line 216, in __init__
    self.parse_symbol_table()
  File "/Users/weishu/dev/github/vmlinux-to-elf/vmlinux_to_elf/kallsyms_finder.py", line 1060, in parse_symbol_table
    symbol.symbol_type = KallsymsSymbolType(symbol_name[0].upper())
  File "/Users/weishu/.pyenv/versions/3.7.3/lib/python3.7/enum.py", line 310, in __call__
    return cls.__new__(cls, value)
  File "/Users/weishu/.pyenv/versions/3.7.3/lib/python3.7/enum.py", line 564, in __new__
    raise exc
  File "/Users/weishu/.pyenv/versions/3.7.3/lib/python3.7/enum.py", line 548, in __new__
    result = cls._missing_(value)
  File "/Users/weishu/.pyenv/versions/3.7.3/lib/python3.7/enum.py", line 577, in _missing_
    raise ValueError("%r is not a valid %s" % (value, cls.__name__))
ValueError: '1' is not a valid KallsymsSymbolType

It seems that symbol_name[0] is '1' at https://github.com/marin-m/vmlinux-to-elf/blob/master/vmlinux_to_elf/kallsyms_finder.py#L1060

I tried changing the code to this:

            else:
                try:
                    symbol.symbol_type = KallsymsSymbolType(symbol_name[0].upper())
                    symbol.is_global = symbol_name[0].isupper()
                except ValueError:
                    logging.warning('Unknown symbol type: %s', symbol_name[0])
                    continue

With this change the ELF file is generated, but it does not seem to be correct :(

The kernel: https://drive.google.com/file/d/1x-SMr699bW7pmpSDbNqeT6GVro5flqxw/view?usp=sharing

Any suggestion would be helpful, thank you!

Corrupt symbolization of Google Pixel 3 XL kernel factory image

First, I would like to thank you all for this amazing work.

Second, I would like to point out a bug (I think) that I encountered. I tried running vmlinux-to-elf on a boot.img extracted from the factory firmware downloaded from Google's image archive, and the generated ELF file had offsets pointing at incorrect locations (although it does open as a valid ELF file in RE tools).

The exact image I used was for build number SP1A.210812.016.C1 for the Google Pixel 3 XL. You can download it from here (search for 67ea87fc in that page).

Reproduction

$ ./vmlinux-to-elf ./path/to/crosshatch-sp1a.210812.016.c1/boot.img ./path/to/symbolized/kernel.elf
[+] Kernel successfully decompressed in-memory (the offsets that follow will be given relative to the decompressed binary)
[+] Kernel successfully decompressed in-memory (the offsets that follow will be given relative to the decompressed binary)
[+] Version string: Linux version 4.9.270-g862f51bac900-ab7613625 (android-build@abfarm-east4-101) (Android (7284624, based on r416183b) clang version 12.0.5 (https://android.googlesource.com/toolchain/llvm-project c935d99d7cf2016289302412d708641d52d2f7ee)) #0 SMP PREEMPT Thu Aug 5 07:04:42 UTC 2021
[+] Guessed architecture: aarch64 successfully in 3.56 seconds
[+] Found relocations table at file offset 0x24ffc78 (count=245151)
[+] Found kernel text candidate: 0xffffff8008000000
WARNING! bad rela offset ffffff800af3e1b8
[+] Found kallsyms_token_table at file offset 0x01cb9c00
[+] Found kallsyms_token_index at file offset 0x01cba000
[+] Found kallsyms_markers at file offset 0x01cb8500
[+] Found kallsyms_names at file offset 0x01a37f00
[+] Found kallsyms_num_syms at file offset 0x01a37e00
[i] Negative offsets overall: 0 %
[i] Null addresses overall: 0 %
[+] Found kallsyms_offsets at file offset 0x01983ffc
[+] Successfully wrote the new ELF kernel to ./path/to/symbolized/kernel.elf

Actual Result

The binary has the symbols at invalid locations, which is clearly visible in Ghidra's decompilation of most functions. For example, this is the binder_ioctl function:
image
image
image

The assembly code bears no resemblance to the actual binder_ioctl code, and there are no calls to any other binder functions.

Expected Result

The binary should have the symbols at valid locations. This is an example of a valid binder_ioctl decompilation or a valid elf generated using your tool (for another kernel):
image

Attempted Workarounds/Solutions

I tried:

  • Extracting the kernel from the boot.img with unpack_bootimg.py (from the android.googlesource.com repo) then running vmlinux-to-elf on that (same result)
  • Checking other function symbols (all the ones I checked look messed up)
  • Using Radare2 rather than Ghidra (still seems very messed up to me)

I would also like to note that using the kallsyms-finder script did extract the symbols' addresses correctly on that same boot.img.

ValueError: 2 candidates for kallsyms_token_table in kernel image

Hello, I'm trying to use vmlinux-to-elf on this image: https://drive.google.com/file/d/1D4tMA-gllrzNfHw6bw2DvDnIINpz6gEG/view?usp=sharing

but I get this error:

  File "./vmlinux-to-elf", line 63, in <module>
    ElfSymbolizer(
  File "/home/henris/vmlinux-to-elf/vmlinux_to_elf/elf_symbolizer.py", line 44, in __init__
    kallsyms_finder = KallsymsFinder(file_contents, bit_size)
  File "/home/henris/vmlinux-to-elf/vmlinux_to_elf/kallsyms_finder.py", line 193, in __init__
    self.find_kallsyms_token_table()
  File "/home/henris/vmlinux-to-elf/vmlinux_to_elf/kallsyms_finder.py", line 423, in find_kallsyms_token_table
    raise ValueError('%d candidates for kallsyms_token_table in kernel image' % len(candidates_offsets))
ValueError: 2 candidates for kallsyms_token_table in kernel image

Could you help?

Lots of invalid instructions when analyzed by Ghidra and radare2

First of all, great project 👍 I can imagine this project will help tons of firmware researchers out there.

I've run into a problem, though.
I'm currently doing research on a network camera firmware. Although binwalk didn't really identify a vmlinux.img in the firmware, I managed to find the portion of the raw binary that is supposed to be the kernel image for the camera.
Your script can successfully analyze that portion and convert it into an ELF file. However, when I tried to analyze it with Ghidra, it produced a lot of "invalid instruction" errors. (Same in radare2.)

The camera runs on a MIPS processor and your script has no problem identifying it, so I'm not sure what the problem might be.
Other binaries from the same firmware file can be analyzed without problem when setting language as MIPS:LE:64:64-32addr:o32 in Ghidra.

The data portion I mentioned can be downloaded here : https://drive.google.com/file/d/15gWN5dsWeiSefHpzh9VzPfwiUEpg_GKL/view?usp=sharing

Get kallsyms as txt

Is it possible to provide a way to get the kallsyms in the same format as cat /proc/kallsyms?
I mean this:

c0093440 t devkmsg_llseek
c009351c t devkmsg_read
c00939b0 t log_make_free_space
c0093ad8 t log_store
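The listing above is simple to produce once the symbols are available as (address, type, name) tuples. The sketch below assumes that shape of input, which is an assumption about the caller rather than the tool's actual data model, and `format_kallsyms` is a hypothetical helper.

```python
from typing import Iterable, Tuple


def format_kallsyms(symbols: Iterable[Tuple[int, str, str]], bit_size: int = 32) -> str:
    """Render symbols as /proc/kallsyms-style lines: '<hex address> <type> <name>'."""
    width = bit_size // 4  # 8 hex digits for 32-bit kernels, 16 for 64-bit
    return "\n".join(f"{address:0{width}x} {symbol_type} {name}"
                     for address, symbol_type, name in symbols)


if __name__ == "__main__":
    print(format_kallsyms([
        (0xC0093440, "t", "devkmsg_llseek"),
        (0xC009351C, "t", "devkmsg_read"),
    ]))
```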

error on Aarch64 linux 5.8-RC5 kernel

When analysing Image.gz provided by Arch Linux:

[+] Version string: Linux version 5.8.0-rc5-1-ARCH (builduser@leming) (aarch64-unknown-linux-gnu-gcc (GCC) 9.3.0, GNU ld (GNU Binutils) 2.34) #1 SMP Sun Jul 12 20:12:51 MDT 2020
[+] Guessed architecture: aarch64 successfully in 0.00 seconds
[+] Found kallsyms_token_table at file offset 0x0159d840
Traceback (most recent call last):
  File "C:\Work\git\vmlinux-to-elf\vmlinux_to_elf\main.py", line 65, in <module>
    args.base_address, args.file_offset
  File "C:\Work\git\vmlinux-to-elf\vmlinux_to_elf\elf_symbolizer.py", line 44, in __init__
    kallsyms_finder = KallsymsFinder(file_contents, bit_size)
  File "C:\Work\git\vmlinux-to-elf\vmlinux_to_elf\kallsyms_finder.py", line 194, in __init__
    self.find_kallsyms_token_index()
  File "C:\Work\git\vmlinux-to-elf\vmlinux_to_elf\kallsyms_finder.py", line 473, in find_kallsyms_token_index
    raise ValueError('This structure is not a kallsyms_token_table')
ValueError: This structure is not a kallsyms_token_table

LZ4 legacy format not recognised

I've seen a zImage that uses LZ4 compression and doesn't decompress with this tool. However, I can decompress it with unlz4 and then successfully run vmlinux-to-elf on the resulting uncompressed image (unlz4 does report an error, but the image it produces looks complete and works fine).

From a bit of investigation, it appears that the kernel makes use of the legacy LZ4 format. Its magic is different from the LZ4 magic specified in the vmlinux-to-elf source (and I guess it might also not be supported by the Python LZ4 library?).

https://manpages.debian.org/testing/lz4/unlz4.1.en.html
See the -l option (it actually calls out the Linux kernel).

https://github.com/cockroachdb/c-lz4/blob/master/internal/programs/lz4io.c
This contains the magic as 0x184C2102

I can't provide the zImage but it is an ARM Linux 3.12.25 image
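To make the format difference concrete, here is a small sketch that recognises the legacy magic quoted above and splits the stream into its compressed blocks. It assumes the usual legacy layout: the 4-byte little-endian magic, then blocks that are each prefixed by a 4-byte little-endian compressed size. `iter_lz4_legacy_blocks` is a hypothetical helper, and each block would still have to go through a raw LZ4 block decoder, which is not shown here.

```python
import struct
from typing import Iterator

LZ4_LEGACY_MAGIC = 0x184C2102  # stored on disk as the bytes 02 21 4C 18


def iter_lz4_legacy_blocks(blob: bytes) -> Iterator[bytes]:
    """Yield each compressed block of an LZ4 legacy stream."""
    if len(blob) < 4 or struct.unpack_from("<I", blob)[0] != LZ4_LEGACY_MAGIC:
        raise ValueError("not an LZ4 legacy stream")
    pos = 4
    while pos + 4 <= len(blob):
        (block_size,) = struct.unpack_from("<I", blob, pos)
        pos += 4
        if block_size == 0 or pos + block_size > len(blob):
            break  # trailing bytes that do not form a complete block
        yield blob[pos:pos + block_size]
        pos += block_size
```

Stopping at the first size that does not fit tolerates extra bytes after the last block, which may be why unlz4 reports an error while still producing a complete image; that explanation is a guess.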

Could not find kallsyms_names for Ubuntu Kernel

I receive the following error when trying to use the project on Linux v6.5.0, which is Ubuntu's kernel.

[+] Kernel successfully decompressed in-memory (the offsets that follow will be given relative to the decompressed binary)
[+] Version string: Linux version 6.5.0-17-generic (buildd@lcy02-amd64-034) (x86_64-linux-gnu-gcc-13 (Ubuntu 13.2.0-4ubuntu3) 13.2.0, GNU ld (GNU Binutils for Ubuntu) 2.41) #17-Ubuntu SMP PREEMPT_DYNAMIC  (Ubuntu 6.5.0-17.17-generic 6.5.8)
[+] Guessed architecture: x86_64 successfully in 5.91 seconds
[+] Found kallsyms_token_table at file offset 0x01a27688
[+] Found kallsyms_token_index at file offset 0x01a27a00
[+] Found kallsyms_markers at file offset 0x01a26ab0
Traceback (most recent call last):
  File "/vmlinux-to-elf/./vmlinux-to-elf", line 63, in <module>
    ElfSymbolizer(
  File "/vmlinux-to-elf/vmlinux_to_elf/elf_symbolizer.py", line 44, in __init__
    kallsyms_finder = KallsymsFinder(file_contents, bit_size)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/vmlinux-to-elf/vmlinux_to_elf/kallsyms_finder.py", line 212, in __init__
    self.find_kallsyms_num_syms()
  File "/vmlinux-to-elf/vmlinux_to_elf/kallsyms_finder.py", line 774, in find_kallsyms_num_syms
    raise ValueError('Could not find kallsyms_names')
ValueError: Could not find kallsyms_names

error analyzing a boot.img

Error analyzing the attached boot.img

boot.gz

[+] Kernel successfully decompressed in-memory (the offsets that follow will be given relative to the decompressed binary)
[+] Version string: Linux version 4.9.117-g039ca999a0ec-dirty (build@i3-ri-14-use1a-b-87) (gcc version 6.3.1 20170404 (Linaro GCC 6.3-2017.05) ) #1 SMP PREEMPT Mon Aug 3 01:09:46 UTC 2020
[+] Guessed architecture: armle successfully in 8.58 seconds
[+] Found kallsyms_token_table at file offset 0x00d660e0
[+] Found kallsyms_token_index at file offset 0x00d66430
Traceback (most recent call last):
  File "vmlinux-to-elf/vmlinux_to_elf/main.py", line 61, in <module>
    ElfSymbolizer(
  File "/media/android/ab5d60f6-9904-42cf-9ca2-c9af2dc4f3da/AC101BOX/androiddumps/dumpyara/vmlinux-to-elf/vmlinux_to_elf/elf_symbolizer.py", line 44, in __init__
    kallsyms_finder = KallsymsFinder(file_contents, bit_size)
  File "/media/android/ab5d60f6-9904-42cf-9ca2-c9af2dc4f3da/AC101BOX/androiddumps/dumpyara/vmlinux-to-elf/vmlinux_to_elf/kallsyms_finder.py", line 208, in __init__
    self.find_kallsyms_markers()
  File "/media/android/ab5d60f6-9904-42cf-9ca2-c9af2dc4f3da/AC101BOX/androiddumps/dumpyara/vmlinux-to-elf/vmlinux_to_elf/kallsyms_finder.py", line 697, in find_kallsyms_markers
    raise ValueError('Could not guess the architecture register ' +
ValueError: Could not guess the architecture register size for kernel

Thanks

KALLSYMS base relative heuristic decision incorrect for some boot images

I discovered yesterday that at least some boot images will not properly get kallsyms parsed into the vmlinux image due to CONFIG_KALLSYMS_BASE_RELATIVE being turned off in the kernel. While there appears to be some sort of minor heuristic detection based off kernel version, later kernel versions with this option turned off (for whatever reason) will not get parsed correctly - or at least, mine won't.

I fudged this by forcibly setting has_base_relative = False, but ideally there's a less hacky solution. Perhaps just exposing a new command-line option to the user to ask vmlinux_to_elf to assume one way or another, or perhaps a more aggressive heuristic (e.g. parsing config.gz and looking for the CONFIG flag) would be effective as well.

Here's an example boot image where this is currently failing.
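The config.gz idea above could be sketched roughly as follows. This is only an illustration, not code from vmlinux-to-elf: the function names are hypothetical, and it assumes the kernel was built with an embedded config, which the kernel stores as gzip data right after the `IKCFG_ST` marker. When no embedded config is found it returns None, so the existing version-based heuristic would still be needed as a fallback.

```python
import zlib

IKCFG_START = b"IKCFG_ST"  # marker the kernel places before the embedded config.gz


def read_embedded_config(image: bytes):
    """Return the embedded kernel config as text, or None if it is absent."""
    start = image.find(IKCFG_START)
    if start == -1:
        return None
    blob = image[start + len(IKCFG_START):]
    try:
        # wbits=31 selects gzip framing; a decompress object stops at the end
        # of the gzip stream and ignores whatever bytes follow it.
        data = zlib.decompressobj(wbits=31).decompress(blob)
    except zlib.error:
        return None
    return data.decode("utf-8", "replace")


def base_relative_from_config(image: bytes):
    """True or False if the config settles the question, None otherwise."""
    config = read_embedded_config(image)
    if config is None:
        return None
    return "CONFIG_KALLSYMS_BASE_RELATIVE=y" in config.splitlines()
```

A caller would treat None as "unknown" and only override the heuristic when the result is an explicit True or False.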

Qualcomm zImage: how to recompress it after patching

Hello,

Thanks for your great work! I used your tool to convert a Qualcomm zImage (an UNCOMPRESSED_IMG with an appended DTB) to ELF. Do you know how it is compressed? I can't find any information about the file format, and after I patch it my target will not execute it.

Best regards

My experiences and workarounds for mt6762 4.9 kernel

Hi,

thank you for this great tool! In the end I was able to get it working with my boot image: boot.zip

However, I needed some hacks I at least wanted to document here. I am not knowledgeable enough to properly fix those issues, but these should give at least some hints.

First of all, I was affected by #54 (my boot.img has ikconfig and it shows that it is definitely not using relative base), so I used a quick & dirty patch to get my boot image to parse without negative offsets:

diff --git a/vmlinux_to_elf/kallsyms_finder.py b/vmlinux_to_elf/kallsyms_finder.py
index 5279192..8bbc3ae 100755
--- a/vmlinux_to_elf/kallsyms_finder.py
+++ b/vmlinux_to_elf/kallsyms_finder.py
@@ -853,7 +853,7 @@ class KallsymsFinder:
             and 'ia64' not in self.version_string.lower()
             and 'itanium' not in self.version_string.lower()):
             
-            likely_has_base_relative = True
+            likely_has_base_relative = False
         
         # Does the system seem to be 64-bits?

After that, I had a valid ELF file that Ghidra imported just fine; however, all symbols were garbled.

I was able to fix that by following a quick procedure:

  1. Call kallsyms-finder (with applied relative fix) and save the first line of actual output (ffffff8008080800 T do_undefinstr)
  2. Mask the last 3 hex digits, i.e. the low 12 bits (idea from
    progbits.section_header.sh_addr = first_symbol_virtual_address & 0xfffffffffffff000
    )
  3. Run vmlinux-to-elf with that base address:
    ./vmlinux-to-elf --base-address 0xffffff8008080000 boot.img new2.elf
  4. Import resulting file into Ghidra
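Steps 1 and 2 of the procedure above can be sketched in a few lines of Python. This is a hedged illustration: `guess_base_address` is a hypothetical helper, and it assumes the kallsyms-finder output contains lines of the form `address type name`, like the `ffffff8008080800 T do_undefinstr` line quoted in step 1.

```python
PAGE_MASK = 0xFFFFFFFFFFFFF000  # clears the low 12 bits (4 KiB alignment)


def guess_base_address(kallsyms_output: str) -> int:
    """Mask the address of the first symbol line down to a page boundary."""
    for line in kallsyms_output.splitlines():
        parts = line.split()
        # A symbol line has exactly three fields: hex address, type letter, name.
        if len(parts) == 3 and len(parts[1]) == 1:
            try:
                address = int(parts[0], 16)
            except ValueError:
                continue  # not a symbol line after all, e.g. a log message
            return address & PAGE_MASK
    raise ValueError("no symbol line found in the output")


print(hex(guess_base_address("ffffff8008080800 T do_undefinstr")))
# prints 0xffffff8008080000
```

The printed value is the one passed to --base-address in step 3.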

Now the symbols resolve properly.

I hope this information aids you in resolving the issues in the script.

Regards,
Nick

ValueError for kernel 5.4

Hello,
I was trying to recover an ELF file with symbols from a stripped vmlinux. The vmlinux file is at https://drive.google.com/file/d/1crKV7X6wYkGTzUNYd2U9SivjftpIVNKN/view?usp=sharing
It was extracted from vmlinuz using extract-vmlinux.sh from the kernel tree.

The error message is as below:

 [+] Found kallsyms_offsets at file offset 0x01368860
Traceback (most recent call last):
  File "/usr/local/bin/vmlinux-to-elf", line 67, in <module>
    args.base_address, args.file_offset
  File "/usr/local/lib/python3.6/dist-packages/vmlinux_to_elf/elf_symbolizer.py", line 44, in __init__
    kallsyms_finder = KallsymsFinder(file_contents, bit_size)
  File "/usr/local/lib/python3.6/dist-packages/vmlinux_to_elf/kallsyms_finder.py", line 216, in __init__
    self.parse_symbol_table()
  File "/usr/local/lib/python3.6/dist-packages/vmlinux_to_elf/kallsyms_finder.py", line 1068, in parse_symbol_table
    symbol.symbol_type = KallsymsSymbolType(symbol_name[0].upper())
  File "/usr/lib/python3.6/enum.py", line 293, in __call__
    return cls.__new__(cls, value)
  File "/usr/lib/python3.6/enum.py", line 535, in __new__
    return cls._missing_(value)
  File "/usr/lib/python3.6/enum.py", line 548, in _missing_
    raise ValueError("%r is not a valid %s" % (value, cls.__name__))
ValueError: '_' is not a valid KallsymsSymbolType

Could you give some hints about the above error?
Thanks!

kallsyms-finder output

The tool works as advertised and prints its output to the screen, but in my case that is more than 2000 functions. How can I save the output to a file so that I can actually use it?
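Since the tool writes to standard output, ordinary shell redirection (`>`) should be enough. The same thing can be done from Python, as in this minimal sketch. The helper name is hypothetical, and the commented usage line assumes that kallsyms-finder is on the PATH and that a boot.img exists in the current directory.

```python
import subprocess


def save_command_output(command, output_path):
    """Run a command and write everything it prints on stdout to a file."""
    with open(output_path, "wb") as out:
        result = subprocess.run(command, stdout=out)
    return result.returncode


# Assumed usage, not run here:
# save_command_output(["kallsyms-finder", "boot.img"], "symbols.txt")
```

The resulting file can then be searched or sorted with any text tool.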

parse __ksymtab/__ksymtab_gpl and create symbols

Kernels which do not have CONFIG_KALLSYMS but support loadable modules still have a symbol table for runtime linking of the modules. These are present as simple address/name pairs in sections __ksymtab, __ksymtab_gpl, __ksymtab_strings.

There is probably some simple heuristic that would allow finding them: e.g., in the few samples I have, "loops_per_jiffy" seems to be the first string in __ksymtab_strings, and the two tables immediately precede it.

BTW symbolizing stripped ELF kernels using these tables would be useful too.
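A first step toward the heuristic described above might look like the sketch below. It only recovers the candidate name strings, using the "loops_per_jiffy" observation as an assumption about where __ksymtab_strings starts; it does not decode the address tables, whose entry layout is not covered here. The function name and the set of characters accepted in symbol names are illustrative choices.

```python
import re

NAME_RE = re.compile(rb"[A-Za-z0-9_.$]+")


def read_ksymtab_strings(image: bytes,
                         anchor: bytes = b"loops_per_jiffy\x00",
                         limit: int = 100000):
    """Collect NUL-terminated identifier strings starting at the anchor."""
    pos = image.find(anchor)
    if pos == -1:
        return []
    names = []
    while len(names) < limit:
        end = image.find(b"\x00", pos)
        if end == -1 or end == pos:
            break  # ran off the image, or hit an empty string
        raw = image[pos:end]
        if NAME_RE.fullmatch(raw) is None:
            break  # no longer looks like a symbol name
        names.append(raw.decode("ascii"))
        pos = end + 1
    return names
```

Stopping at the first empty or non-identifier string is a deliberately conservative choice; real images may need a looser rule.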

Performance: ElfStrtab is slow

add_string_and_return_offset runs in O(n) time, since it needs to scan the entirety of raw_string_table every time. This makes adding all of the symbols run in O(n²) time, which on my machine takes around 30 seconds.

return_string_from_offset runs in O(n) time. It takes a slice of raw_string_table which will copy the entirety of raw_string_table from offset to a new bytes object.
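One possible way to avoid both costs is sketched below. It is not the project's ElfStrtab: the class name is hypothetical, only the two method names come from the description above, and it assumes that reusing offsets for exact duplicate strings is sufficient. A dict gives an average O(1) lookup when adding, and searching for the terminating NUL byte avoids copying the rest of the table when reading.

```python
class IndexedStringTable:
    """String table sketch with a dict index next to the raw bytes."""

    def __init__(self):
        self._raw = bytearray(b"\x00")  # ELF string tables start with a NUL byte
        self._offsets = {b"": 0}        # string -> offset of its first occurrence

    def add_string_and_return_offset(self, string: bytes) -> int:
        offset = self._offsets.get(string)
        if offset is None:
            # Not seen before: append it once and remember where it went.
            offset = len(self._raw)
            self._raw += string + b"\x00"
            self._offsets[string] = offset
        return offset

    def return_string_from_offset(self, offset: int) -> bytes:
        # Find the terminator first, then copy only the string itself.
        end = self._raw.index(b"\x00", offset)
        return bytes(self._raw[offset:end])


table = IndexedStringTable()
first = table.add_string_and_return_offset(b"printk")
again = table.add_string_and_return_offset(b"printk")
print(first, again, table.return_string_from_offset(first))
# prints 1 1 b'printk'
```

With this layout, adding n symbols costs time proportional to their total length rather than growing quadratically.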
