Running whisper-cpp-server docker image in a kubernetes cluster as a microservice. I'm

Illegal instruction (core dumped) about whisper-cpp-server HOT 4 OPEN

Jonny-Burkholder commented on September 26, 2024

Illegal instruction (core dumped)

from whisper-cpp-server.

Comments (4)

litongjava commented on September 26, 2024

The issue with your container exiting and the "Illegal instruction (core dumped)" error you encountered is likely due to the CPUs in some nodes of your Kubernetes cluster not supporting the instruction set with which the Docker image was compiled. This situation usually occurs when the Docker image is built on a machine optimized for a specific CPU instruction set (such as AVX2), while some CPUs in the cluster do not support these instructions.

The model you are using is ggml-tiny.en-q5_1.bin. I have not tested this model. Please provide a link to it so I can test it. In the meantime, you might want to try the following two commands. In these images the model is already packaged in the container:

docker run -dit --name whisper-server -p 8080:8080 litongjava/whisper-cpp-server:1.0.0-base-en

docker run -dit --name whisper-server -p 8080:8080 litongjava/whisper-cpp-server:1.0.0-large-v3

I am not aware of the CPU architecture of your target platform. If it is an issue with the CPU architecture, you will need to recompile and build the Docker image on the target platform.
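One way to check a node before scheduling the container on it is to compare the CPU flags the kernel reports against the extensions the binary was built with. The following is a minimal sketch, not part of whisper-cpp-server: the function names are hypothetical, and the required set (AVX, AVX2, FMA, F16C) is an assumption taken from the `system_info` line in the maintainer's log later in this thread.

```python
from pathlib import Path

# Assumption: the image needs these x86 extensions, based on the
# "AVX = 1 | AVX2 = 1 | FMA = 1 | F16C = 1" entries in the maintainer's log.
REQUIRED_FLAGS = {"avx", "avx2", "fma", "f16c"}


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the flags from the first 'flags' line of /proc/cpuinfo-style text.

    Also accepts lscpu output, where the key is spelled 'Flags'.
    """
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text: str, required=REQUIRED_FLAGS) -> set[str]:
    """Return the required flags that the CPU does not advertise."""
    return set(required) - cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only
    if path.exists():
        missing = missing_flags(path.read_text())
        print("missing:", sorted(missing) if missing else "none")
    else:
        print("/proc/cpuinfo not found; run this on the Linux node itself")
```

If any flag is reported missing on a node, a binary compiled with that extension can be expected to fail there with an illegal instruction, which matches the symptom in this issue.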

Jonny-Burkholder commented on September 26, 2024

You're right, that's exactly what it is. Thanks!

Jonny-Burkholder commented on September 26, 2024

The model is one of the ones available on HuggingFace from ggerganov. My kubernetes environment is pretty resource limited, and I'm planning to use this for voice commands, so I wanted to use the smallest model I could, then implement something like a Levenshtein distance algorithm to match to the commands in case of a mis-transcribed word.
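For illustration, the matching step described above could look roughly like this. It is a sketch under assumptions: the command list, the `match_command` helper, and the `max_distance` threshold of 3 are all made up for the example and would need tuning against real transcripts.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the row we store as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def match_command(transcript: str, commands: list[str], max_distance: int = 3):
    """Return the closest known command, or None if nothing is close enough."""
    # Normalize case, whitespace, and leading/trailing punctuation.
    text = " ".join(transcript.lower().split()).strip(" .,!?")
    best = min(commands, key=lambda c: levenshtein(text, c), default=None)
    if best is None or levenshtein(text, best) > max_distance:
        return None
    return best


if __name__ == "__main__":
    commands = ["turn on the lights", "turn off the lights", "play music"]
    print(match_command("Play musik.", commands))
    print(match_command("what time is it", commands))
```

Commands that differ by only a character or two (such as "turn on" and "turn off" here) can tie or nearly tie, so a small, well-separated command vocabulary makes this approach more reliable.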

Here's a link to the model:
https://huggingface.co/ggerganov/whisper.cpp/blob/main/ggml-tiny.en-q5_1.bin

litongjava commented on September 26, 2024

Here's my CPU info:

root@ping-Inspiron-3458:~/code/whisper-cpp-server# lscpu
Architecture:                       x86_64
CPU op-mode(s):                     32-bit, 64-bit
Byte Order:                         Little Endian
Address sizes:                      39 bits physical, 48 bits virtual
CPU(s):                             4
On-line CPU(s) list:                0-3
Thread(s) per core:                 2
Core(s) per socket:                 2
Socket(s):                          1
NUMA node(s):                       1
Vendor ID:                          GenuineIntel
CPU family:                         6
Model:                              69
Model name:                         Intel(R) Core(TM) i5-4210U CPU @ 1.70GHz
Stepping:                           1
CPU MHz:                            2394.294
CPU max MHz:                        2700.0000
CPU min MHz:                        800.0000
BogoMIPS:                           4788.58
Virtualization:                     VT-x
L1d cache:                          64 KiB
L1i cache:                          64 KiB
L2 cache:                           512 KiB
L3 cache:                           3 MiB
NUMA node0 CPU(s):                  0-3
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit:        KVM: Mitigation: VMX disabled
Vulnerability L1tf:                 Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Vulnerability Mds:                  Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Meltdown:             Mitigation; PTI
Vulnerability Mmio stale data:      Unknown: No mitigations
Vulnerability Retbleed:             Not affected
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass:    Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:           Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:           Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:                Mitigation; Microcode
Vulnerability Tsx async abort:      Not affected
Flags:                              fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts md_clear flush_l1d

I built a new Docker image for you. It works well in my test environment. Please test it:

docker run -dit --name whisper-server -p 8080:8080 litongjava/whisper-cpp-server:1.0.0-tiny.en-q5_1

root@ping-Inspiron-3458:~/code/whisper-cpp-server# docker logs -f 4df89f6238da
whisper_init_from_file_with_params_no_state: loading model from '/app/models/ggml-tiny.en-q5_1.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 384
whisper_model_load: n_audio_head  = 6
whisper_model_load: n_audio_layer = 4
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 384
whisper_model_load: n_text_head   = 6
whisper_model_load: n_text_layer  = 4
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 9
whisper_model_load: qntvr         = 1
whisper_model_load: type          = 1 (tiny)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: n_langs       = 99
whisper_model_load:      CPU total size =    31.57 MB
whisper_model_load: model size    =   31.57 MB
whisper_init_state: kv self size  =    8.26 MB
whisper_init_state: kv cross size =    9.22 MB
whisper_init_state: compute buffer (conv)   =   13.32 MB
whisper_init_state: compute buffer (encode) =   85.66 MB
whisper_init_state: compute buffer (cross)  =    4.01 MB
whisper_init_state: compute buffer (decode) =   96.02 MB

whisper service listening at http://0.0.0.0:8080

24-04-25 12:14:10.787: Received filename: jfk.wav 
24-04-25 12:14:10.787: audio_format:wav 
Successfully loaded jfk.wav

system_info: n_threads = 4 / 4 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | METAL = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 | 

run: WARNING: model is not multilingual, ignoring language and translation options
run: processing 'jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 processors, lang = en, task = transcribe, timestamps = 1 ...

Running whisper.cpp inference on jfk.wav

[00:00:00.000 --> 00:00:07.740]   And so my fellow Americans ask not what your country can do for you
[00:00:07.740 --> 00:00:10.580]   ask what you can do for your country.
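The `system_info` line in the log above lists which CPU features this build reports as enabled, which is useful when comparing against another machine's `lscpu` flags. A small sketch for pulling the enabled features out of such a line follows; `parse_system_info` is a hypothetical helper written for this thread, not a whisper.cpp API, and it assumes the `NAME = 0|1` layout shown in the log.

```python
def parse_system_info(line: str) -> dict[str, int]:
    """Parse 'NAME = 0|1' pairs from a whisper.cpp system_info log line."""
    _, _, body = line.partition("system_info:")
    features = {}
    for field in body.split("|"):
        name, sep, value = field.partition("=")
        name, value = name.strip(), value.strip()
        # Skip non-boolean fields such as 'n_threads = 4 / 4'.
        if sep and value in ("0", "1"):
            features[name] = int(value)
    return features


if __name__ == "__main__":
    log_line = (
        "system_info: n_threads = 4 / 4 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | "
        "FMA = 1 | NEON = 0 | ARM_FMA = 0 | METAL = 0 | F16C = 1 | "
        "FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | "
        "VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 | "
    )
    features = parse_system_info(log_line)
    print(sorted(name for name, on in features.items() if on))
```

On the log line from this comment, the enabled features are AVX, AVX2, F16C, FMA, SSE3, and SSSE3, all of which appear in the `lscpu` flags of the i5-4210U shown earlier.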
