inferflow / inferflow
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
License: MIT License
devices = 0&1&2&3;4&5&6&7
decoder_cpu_layer_count = 0
cpu_threads = 8
max_concurrent_queries = 6
return_output_tensors = true
;debug options
is_study_mode = false
show_tensors = false
Configuration = release; Platform = x64
========== ========== ========== ========== ========== ==========
Loading model specifications...
Loading model opt_13b...
vocab_size: 50272, embd_dims: 5120, decoder layers: 40, decoder heads: 40, decoder kv heads: 40
qkv_format 1 is not compatible with tensor parallelism
Failed to load the model
Failed to initialize the inference engine
Memory usage (MB): 203.21, 203.21 (Peak)
Press the enter key to quit...
Thanks.
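For readers trying to relate the `devices` value above to the "not compatible with tensor parallelism" message, here is a minimal sketch that splits such a value into device groups. It assumes, based only on the syntax shown in the config, that `;` separates groups and `&` joins device IDs within a group. Which separator corresponds to tensor partitioning is not stated in the log and should be checked against the Inferflow documentation. The function names are hypothetical and are not part of Inferflow.

```python
from typing import List


def parse_devices(spec: str) -> List[List[int]]:
    """Split a devices value such as "0&1&2&3;4&5&6&7" into groups of device IDs.

    Assumption: ';' separates groups and '&' joins devices inside a group.
    """
    groups = []
    for group in spec.split(";"):
        group = group.strip()
        if not group:
            continue  # tolerate a trailing ';'
        groups.append([int(device) for device in group.split("&")])
    return groups


def has_multi_device_group(groups: List[List[int]]) -> bool:
    """Return True if any group contains more than one device."""
    return any(len(group) > 1 for group in groups)


groups = parse_devices("0&1&2&3;4&5&6&7")
print(groups)
print(has_multi_device_group(groups))
```

With the value from the config above, this yields two groups of four devices each, so the configuration combines both separators. A single-device value such as `0` yields one group with one device.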
Good job!
I hope to see comparisons with other frameworks on some models, covering metrics such as throughput and time to first token.
Steps to reproduce:
Did I miss anything?
The bash file in step 1 requires wget; rather than installing wget, I downloaded the files manually.
I assume step 2 is a typo, as it's asking to execute a configuration file (that doesn't exist).
As I've already downloaded the files from the example above, I'll choose to use that model (llama2.c).
I opened the configuration file, but I'm not sure what to edit.
I assume that I need to add a new entry here:
However, the naming convention is unclear.
Let's look at the entries that exist and see if we can match them to models in the model folder, here:
We can see that facebook_m2m200 exists (not yet downloaded); however, the names do not match, i.e., it is called facebook_m2m100_418m.
Also, bert is in the list; however, it's called bert_base_multilingual_cased.
Therefore I conclude that this is not the entry that I need to edit.
Can you please clarify exactly what needs to be edited in this file?
I'm not sure what needs to be edited here either.
Can you clarify this too please?
Here's my attempt:
I'm assuming that I need to uncomment the models I downloaded earlier
I've tried with just one and with both
I also assume that I need to set devices=1 to use the GPU? (I only built the GPU solution.)
It doesn't look like I need to change anything in this file, so I keep it as it is
client log
`
; --------------------------------------------
#1; 2024-1-19 17:28:4; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 17:28:4; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 17:28:4; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: inferflow_client; Version: 0.1
#4; 2024-1-19 17:28:4; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 17:28:4; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 17:28:4; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 17:28:4; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 17:28:4; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
#9; 2024-1-19 17:28:4; 0x8400(info_key); Run#60@inferflow_client.cc
decoding_strategy: sample.std
#10; 2024-1-19 17:28:4; 0x8400(info_key); Run#61@inferflow_client.cc
query_random_seed: 1
#11; 2024-1-19 17:28:4; 0x8400(info_key); Run#62@inferflow_client.cc
temperature: 0.70
#12; 2024-1-19 17:28:4; 0x200(warning); sslib::HttpClient::ExecuteInner#316@http_client.cc
Connecting error (timeout: 100)
#13; 2024-1-19 17:28:4; 0x300(error); Run#85@inferflow_client.cc
Failed to process the request (error-code: 2)
#14; 2024-1-19 17:28:4; 0x8400(info_key); sslib::AppEnv::LogProcessMemoryUsage#445@app_environment.cc
Memory usage (MB): 10.06, 10.12 (Peak)
; --------------------------------------------
#1; 2024-1-19 17:29:33; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 17:29:33; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 17:29:33; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: inferflow_client; Version: 0.1
#4; 2024-1-19 17:29:33; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 17:29:33; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 17:29:33; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 17:29:33; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 17:29:33; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
#9; 2024-1-19 17:29:33; 0x8400(info_key); Run#60@inferflow_client.cc
decoding_strategy: sample.std
#10; 2024-1-19 17:29:33; 0x8400(info_key); Run#61@inferflow_client.cc
query_random_seed: 1
#11; 2024-1-19 17:29:33; 0x8400(info_key); Run#62@inferflow_client.cc
temperature: 0.70
#12; 2024-1-19 17:29:33; 0x200(warning); sslib::HttpClient::ExecuteInner#316@http_client.cc
Connecting error (timeout: 100)
#13; 2024-1-19 17:29:33; 0x300(error); Run#85@inferflow_client.cc
Failed to process the request (error-code: 2)
#14; 2024-1-19 17:29:33; 0x8400(info_key); sslib::AppEnv::LogProcessMemoryUsage#445@app_environment.cc
Memory usage (MB): 9.75, 9.82 (Peak)
; --------------------------------------------
#1; 2024-1-19 17:30:53; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 17:30:53; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 17:30:53; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: inferflow_client; Version: 0.1
#4; 2024-1-19 17:30:53; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 17:30:53; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 17:30:53; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 17:30:53; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 17:30:53; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
#9; 2024-1-19 17:30:53; 0x8400(info_key); Run#60@inferflow_client.cc
decoding_strategy: sample.std
#10; 2024-1-19 17:30:53; 0x8400(info_key); Run#61@inferflow_client.cc
query_random_seed: 1
#11; 2024-1-19 17:30:53; 0x8400(info_key); Run#62@inferflow_client.cc
temperature: 0.70
#12; 2024-1-19 17:30:53; 0x200(warning); sslib::HttpClient::ExecuteInner#316@http_client.cc
Connecting error (timeout: 100)
#13; 2024-1-19 17:30:53; 0x300(error); Run#85@inferflow_client.cc
Failed to process the request (error-code: 2)
#14; 2024-1-19 17:30:53; 0x8400(info_key); sslib::AppEnv::LogProcessMemoryUsage#445@app_environment.cc
Memory usage (MB): 9.77, 9.82 (Peak)
; --------------------------------------------
#1; 2024-1-19 18:1:10; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 18:1:10; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 18:1:10; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: inferflow_client; Version: 0.1
#4; 2024-1-19 18:1:10; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 18:1:10; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 18:1:10; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 18:1:10; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 18:1:10; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
#9; 2024-1-19 18:1:10; 0x8400(info_key); Run#60@inferflow_client.cc
decoding_strategy: sample.std
#10; 2024-1-19 18:1:10; 0x8400(info_key); Run#61@inferflow_client.cc
query_random_seed: 1
#11; 2024-1-19 18:1:10; 0x8400(info_key); Run#62@inferflow_client.cc
temperature: 0.70
#12; 2024-1-19 18:1:10; 0x200(warning); sslib::HttpClient::ExecuteInner#316@http_client.cc
Connecting error (timeout: 100)
#13; 2024-1-19 18:1:10; 0x300(error); Run#85@inferflow_client.cc
Failed to process the request (error-code: 2)
#14; 2024-1-19 18:1:10; 0x8400(info_key); sslib::AppEnv::LogProcessMemoryUsage#445@app_environment.cc
Memory usage (MB): 9.74, 9.80 (Peak)
; --------------------------------------------
#1; 2024-1-19 18:17:37; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 18:17:37; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 18:17:37; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: inferflow_client; Version: 0.1
#4; 2024-1-19 18:17:37; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 18:17:37; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 18:17:37; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 18:17:37; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 18:17:37; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
#9; 2024-1-19 18:17:37; 0x8400(info_key); Run#60@inferflow_client.cc
decoding_strategy: sample.std
#10; 2024-1-19 18:17:37; 0x8400(info_key); Run#61@inferflow_client.cc
query_random_seed: 1
#11; 2024-1-19 18:17:37; 0x8400(info_key); Run#62@inferflow_client.cc
temperature: 0.70
#12; 2024-1-19 18:17:37; 0x200(warning); sslib::HttpClient::ExecuteInner#316@http_client.cc
Connecting error (timeout: 100)
#13; 2024-1-19 18:17:37; 0x300(error); Run#85@inferflow_client.cc
Failed to process the request (error-code: 2)
#14; 2024-1-19 18:17:37; 0x8400(info_key); sslib::AppEnv::LogProcessMemoryUsage#445@app_environment.cc
Memory usage (MB): 10.33, 10.39 (Peak)
; --------------------------------------------
#1; 2024-1-19 18:17:55; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 18:17:55; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 18:17:55; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: inferflow_client; Version: 0.1
#4; 2024-1-19 18:17:55; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 18:17:55; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 18:17:55; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 18:17:55; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 18:17:55; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
#9; 2024-1-19 18:17:55; 0x8400(info_key); Run#60@inferflow_client.cc
decoding_strategy: sample.std
#10; 2024-1-19 18:17:55; 0x8400(info_key); Run#61@inferflow_client.cc
query_random_seed: 1
#11; 2024-1-19 18:17:55; 0x8400(info_key); Run#62@inferflow_client.cc
temperature: 0.70
#12; 2024-1-19 18:17:55; 0x200(warning); sslib::HttpClient::ExecuteInner#316@http_client.cc
Connecting error (timeout: 100)
#13; 2024-1-19 18:17:55; 0x300(error); Run#85@inferflow_client.cc
Failed to process the request (error-code: 2)
#14; 2024-1-19 18:17:55; 0x8400(info_key); sslib::AppEnv::LogProcessMemoryUsage#445@app_environment.cc
Memory usage (MB): 9.72, 9.79 (Peak)
; --------------------------------------------
#1; 2024-1-19 18:36:36; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 18:36:36; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 18:36:36; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: inferflow_client; Version: 0.1
#4; 2024-1-19 18:36:36; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 18:36:36; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 18:36:36; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 18:36:36; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 18:36:36; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
#9; 2024-1-19 18:36:36; 0x8400(info_key); Run#60@inferflow_client.cc
decoding_strategy: sample.std
#10; 2024-1-19 18:36:36; 0x8400(info_key); Run#61@inferflow_client.cc
query_random_seed: 1
#11; 2024-1-19 18:36:36; 0x8400(info_key); Run#62@inferflow_client.cc
temperature: 0.70
#12; 2024-1-19 18:36:36; 0x200(warning); sslib::HttpClient::ExecuteInner#316@http_client.cc
Connecting error (timeout: 100)
#13; 2024-1-19 18:36:36; 0x300(error); Run#85@inferflow_client.cc
Failed to process the request (error-code: 2)
#14; 2024-1-19 18:36:36; 0x8400(info_key); sslib::AppEnv::LogProcessMemoryUsage#445@app_environment.cc
Memory usage (MB): 9.75, 9.81 (Peak)
; --------------------------------------------
#1; 2024-1-19 18:47:2; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 18:47:2; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 18:47:2; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: inferflow_client; Version: 0.1
#4; 2024-1-19 18:47:2; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 18:47:2; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 18:47:2; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 18:47:2; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 18:47:2; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
#9; 2024-1-19 18:47:2; 0x8400(info_key); Run#60@inferflow_client.cc
decoding_strategy: sample.std
#10; 2024-1-19 18:47:2; 0x8400(info_key); Run#61@inferflow_client.cc
query_random_seed: 1
#11; 2024-1-19 18:47:2; 0x8400(info_key); Run#62@inferflow_client.cc
temperature: 0.70
#12; 2024-1-19 18:47:2; 0x200(warning); sslib::HttpClient::ExecuteInner#316@http_client.cc
Connecting error (timeout: 100)
#13; 2024-1-19 18:47:2; 0x300(error); Run#85@inferflow_client.cc
Failed to process the request (error-code: 2)
#14; 2024-1-19 18:47:2; 0x8400(info_key); sslib::AppEnv::LogProcessMemoryUsage#445@app_environment.cc
Memory usage (MB): 9.75, 9.81 (Peak)
`
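Because every client run above repeats the same startup lines, it can help to filter the log down to warnings and errors. The following is a minimal sketch, assuming the layout visible in the log: a header line of the form `#N; timestamp; 0xCODE(level); source` followed by one or more message lines, with runs separated by a `; ----` line. The function names are hypothetical and are not part of Inferflow.

```python
import re
from typing import Dict, List

# Header layout taken from the log above:
#   #12; 2024-1-19 17:28:4; 0x200(warning); sslib::HttpClient::ExecuteInner#316@http_client.cc
HEADER = re.compile(r"^#(\d+); (\S+ \S+); (0x[0-9a-fA-F]+)\((\w+)\); (.+)$")


def parse_log(text: str) -> List[Dict[str, str]]:
    """Turn raw log text into a list of entries with level, source and message."""
    entries: List[Dict[str, str]] = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        match = HEADER.match(line)
        if match:
            current = {
                "seq": match.group(1),
                "time": match.group(2),
                "level": match.group(4),
                "source": match.group(5),
                "message": "",
            }
            entries.append(current)
        elif line.startswith("; ---"):
            current = None  # separator between runs
        elif current is not None and line:
            # Message lines follow their header; join multi-line messages.
            current["message"] = (current["message"] + "\n" + line).strip()
    return entries


def problems(entries: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only entries logged at warning or error level."""
    return [entry for entry in entries if entry["level"] in ("warning", "error")]


sample = """\
; --------------------------------------------
#11; 2024-1-19 17:28:4; 0x8400(info_key); Run#62@inferflow_client.cc
temperature: 0.70
#12; 2024-1-19 17:28:4; 0x200(warning); sslib::HttpClient::ExecuteInner#316@http_client.cc
Connecting error (timeout: 100)
#13; 2024-1-19 17:28:4; 0x300(error); Run#85@inferflow_client.cc
Failed to process the request (error-code: 2)
"""

for entry in problems(parse_log(sample)):
    print(entry["level"], "|", entry["source"], "|", entry["message"])
```

Applied to the client log above, this kind of filter would leave only the "Connecting error" warning and the "Failed to process the request" error from each run.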
service log
`
; --------------------------------------------
#1; 2024-1-19 17:27:3; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 17:27:3; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 17:27:3; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: inferflow_service; Version: 0.1.0
#4; 2024-1-19 17:27:3; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 17:27:3; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 17:27:3; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 17:27:3; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 17:27:3; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
#9; 2024-1-19 17:27:3; 0x8400(info_key); Run#13@inferflow_service_main.cc
Initializing the Inferflow service...
#10; 2024-1-19 17:27:3; 0x8400(info_key); inferflow::transformer::InferenceEngine::LoadConfig#1450@inference_engine.cc
Loading model specifications...
; --------------------------------------------
#1; 2024-1-19 17:29:17; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 17:29:17; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 17:29:17; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: inferflow_service; Version: 0.1.0
#4; 2024-1-19 17:29:17; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 17:29:17; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 17:29:17; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 17:29:17; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 17:29:17; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
#9; 2024-1-19 17:29:17; 0x8400(info_key); Run#13@inferflow_service_main.cc
Initializing the Inferflow service...
#10; 2024-1-19 17:29:17; 0x8400(info_key); inferflow::transformer::InferenceEngine::LoadConfig#1450@inference_engine.cc
Loading model specifications...
; --------------------------------------------
#1; 2024-1-19 17:30:41; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 17:30:41; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 17:30:41; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: inferflow_service; Version: 0.1.0
#4; 2024-1-19 17:30:41; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 17:30:41; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 17:30:41; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 17:30:41; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 17:30:41; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
#9; 2024-1-19 17:30:41; 0x8400(info_key); Run#13@inferflow_service_main.cc
Initializing the Inferflow service...
#10; 2024-1-19 17:30:41; 0x8400(info_key); inferflow::transformer::InferenceEngine::LoadConfig#1450@inference_engine.cc
Loading model specifications...
; --------------------------------------------
#1; 2024-1-19 17:59:42; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 17:59:42; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 17:59:42; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: inferflow_service; Version: 0.1.0
#4; 2024-1-19 17:59:42; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 17:59:42; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 17:59:42; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 17:59:42; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 17:59:42; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
#9; 2024-1-19 17:59:42; 0x8400(info_key); Run#13@inferflow_service_main.cc
Initializing the Inferflow service...
#10; 2024-1-19 17:59:42; 0x8400(info_key); inferflow::transformer::InferenceEngine::LoadConfig#1450@inference_engine.cc
Loading model specifications...
; --------------------------------------------
#1; 2024-1-19 18:17:49; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 18:17:49; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 18:17:49; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: inferflow_service; Version: 0.1.0
#4; 2024-1-19 18:17:49; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 18:17:49; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 18:17:49; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 18:17:49; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 18:17:49; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
#9; 2024-1-19 18:17:49; 0x8400(info_key); Run#13@inferflow_service_main.cc
Initializing the Inferflow service...
#10; 2024-1-19 18:17:49; 0x8400(info_key); inferflow::transformer::InferenceEngine::LoadConfig#1450@inference_engine.cc
Loading model specifications...
; --------------------------------------------
#1; 2024-1-19 18:36:33; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 18:36:33; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 18:36:33; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: inferflow_service; Version: 0.1.0
#4; 2024-1-19 18:36:33; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 18:36:33; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 18:36:33; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 18:36:33; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 18:36:33; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
#9; 2024-1-19 18:36:33; 0x8400(info_key); Run#13@inferflow_service_main.cc
Initializing the Inferflow service...
#10; 2024-1-19 18:36:33; 0x8400(info_key); inferflow::transformer::InferenceEngine::LoadConfig#1450@inference_engine.cc
Loading model specifications...
; --------------------------------------------
#1; 2024-1-19 18:43:43; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 18:43:43; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 18:43:43; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: inferflow_service; Version: 0.1.0
#4; 2024-1-19 18:43:43; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 18:43:43; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 18:43:43; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 18:43:43; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 18:43:43; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
#9; 2024-1-19 18:43:43; 0x8400(info_key); Run#13@inferflow_service_main.cc
Initializing the Inferflow service...
#10; 2024-1-19 18:43:43; 0x8400(info_key); inferflow::transformer::InferenceEngine::LoadConfig#1450@inference_engine.cc
Loading model specifications...
`
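Every service run above ends at "Loading model specifications...", and the client reports "Connecting error" shortly afterwards. That is consistent with the service not yet listening when the client sent its request, although the logs alone do not prove it. A minimal sketch for checking this before starting the client is shown below. The host and port are hypothetical placeholders; the real values would come from the service configuration, which is not shown here.

```python
import socket


def can_connect(host: str, port: int, timeout_s: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Hypothetical address: replace with the host and port the service is configured to use.
    HOST, PORT = "127.0.0.1", 8080
    if can_connect(HOST, PORT):
        print("Service port is accepting connections.")
    else:
        print("Nothing is listening on that port yet.")
```

If the check fails while the service window is still open, the service has most likely not finished (or has failed) loading the model, and the service configuration is the place to look rather than the client.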
inference log
`
; --------------------------------------------
#1; 2024-1-19 17:19:14; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 17:19:14; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 17:19:14; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: llm_inference; Version: 0.1.0
#4; 2024-1-19 17:19:14; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 17:19:14; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 17:19:14; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 17:19:14; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 17:19:14; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
; --------------------------------------------
#1; 2024-1-19 17:29:1; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 17:29:1; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 17:29:1; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: llm_inference; Version: 0.1.0
#4; 2024-1-19 17:29:1; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 17:29:1; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 17:29:1; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 17:29:1; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 17:29:1; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
; --------------------------------------------
#1; 2024-1-19 17:30:33; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 17:30:33; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 17:30:33; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: llm_inference; Version: 0.1.0
#4; 2024-1-19 17:30:33; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 17:30:33; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 17:30:33; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 17:30:33; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 17:30:33; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
; --------------------------------------------
#1; 2024-1-19 17:52:30; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 17:52:30; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 17:52:30; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: llm_inference; Version: 0.1.0
#4; 2024-1-19 17:52:30; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 17:52:30; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 17:52:30; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 17:52:30; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 17:52:30; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
; --------------------------------------------
#1; 2024-1-19 18:17:43; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 18:17:43; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 18:17:43; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: llm_inference; Version: 0.1.0
#4; 2024-1-19 18:17:43; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 18:17:43; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 18:17:43; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 18:17:43; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 18:17:43; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
; --------------------------------------------
#1; 2024-1-19 18:32:58; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 18:32:58; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 18:32:58; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: llm_inference; Version: 0.1.0
#4; 2024-1-19 18:32:58; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 18:32:58; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 18:32:58; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 18:32:58; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 18:32:58; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
; --------------------------------------------
#1; 2024-1-19 18:36:28; 0x8400(info_key); sslib::AppEnv::Init#109@app_environment.cc
Application environment is set successfullly.
#2; 2024-1-19 18:36:28; 0x8400(info_key); sslib::AppEnv::Init#110@app_environment.cc
app_dir: C:\Users\micha\source\repos\inferflow-main\bin\x64_Release
#3; 2024-1-19 18:36:28; 0x8400(info_key); sslib::AppEnv::Init#111@app_environment.cc
app_name: llm_inference; Version: 0.1.0
#4; 2024-1-19 18:36:28; 0x8400(info_key); sslib::AppEnv::Init#116@app_environment.cc
config_dir: C:\Users\micha\source\repos\inferflow-main\bin
#5; 2024-1-19 18:36:28; 0x8400(info_key); sslib::AppEnv::Init#117@app_environment.cc
data_root_dir: C:\Users\micha\source\repos\inferflow-main\data/inferflow/
#6; 2024-1-19 18:36:28; 0x8400(info_key); sslib::AppEnv::Init#118@app_environment.cc
is_daemon: false
#7; 2024-1-19 18:36:28; 0x8400(info_key); sslib::AppEnv::Init#124@app_environment.cc
Configuration = release; Platform = x64
#8; 2024-1-19 18:36:28; 0x8400(info_key); sslib::AppEnv::Init#127@app_environment.cc
========== ========== ========== ========== ========== ==========
`
To summarise: the instructions were unclear, so I had to join the dots. Everything appeared to work fine until I ran the client and got an error. Logs are included.
As a side note, the error seems to imply that it failed after waiting only 100 ms? That timeout could probably be increased.
Model: Baichuan2 7B Chat
Input: tell me a 50 word story
Inferflow Service Log:
decoding_alg: , strategy_id: 4, temperature: 0.00
Encoder input text:
Decoder input text:
{<reserved_106>}
tell me a 50 word story{<reserved_107>}
query_id: 1, output_len: 165, is_end: true
<reserved_106> and <reserved_107> should not be surrounded with {}, but in the file inferflow_service.ini they are surrounded with {}. Should I just delete the {} from inferflow_service.ini, or is it expected to be removed in the C++ code?
The Baichuan2 prompt template is: \n\n<reserved_106>\n{query}<reserved_107>\n
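For reference, here is a minimal Python sketch of what the template quoted above should expand to. It assumes the `\n` sequences in the template stand for real newlines and that `{query}` is the only placeholder; `build_prompt` is a hypothetical helper for illustration, not part of Inferflow.

```python
# Template as quoted in this issue. Assumption: "\n" means a real newline
# and {query} is the only placeholder to substitute.
BAICHUAN2_TEMPLATE = "\n\n<reserved_106>\n{query}<reserved_107>\n"


def build_prompt(query: str) -> str:
    """Hypothetical helper: substitute the user query into the template."""
    # str.replace is used instead of str.format so that any braces inside
    # the query itself are left untouched.
    return BAICHUAN2_TEMPLATE.replace("{query}", query)


prompt = build_prompt("tell me a 50 word story")
print(repr(prompt))
```

With this expansion the reserved tokens appear bare, whereas the service log above shows them wrapped as {<reserved_106>} and {<reserved_107>}.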
Hi,
Is there a plan to support deepseek-moe-16b-chat model?
I get the following truncated output when the input is 700+ tokens. vLLM returns the whole output for the same input.
{
"ret_code": "succ",
"time_cost": 1.03,
"text": "[{\""
}
Inferflow service log for this request:
query_id: 13, output_len: 729, is_end: true
Which parameter should I modify?
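A small Python sketch of how the truncation can be detected on the client side, using the response body shown above. It assumes the model was expected to return a JSON document in the "text" field (the `[{"` prefix suggests this, but it is an assumption); `looks_truncated` is a hypothetical helper.

```python
import json

# Response body copied from this issue.
response_body = '{"ret_code": "succ", "time_cost": 1.03, "text": "[{\\""}'
response = json.loads(response_body)


def looks_truncated(text: str) -> bool:
    """Hypothetical check: treat text that is not valid JSON as truncated.

    Assumption: the expected model output is itself a JSON document.
    """
    try:
        json.loads(text)
        return False
    except json.JSONDecodeError:
        return True


print(response["ret_code"], looks_truncated(response["text"]))
```

Note that the service reports "succ" even though the text stops after three characters, so the return code alone does not reveal the problem.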
We are using the OpenAI REST API with vLLM. Will Inferflow support the OpenAI REST API later? That would make it easier for us to migrate if Inferflow performs better than vLLM.
By the way, have you done any performance comparisons with vLLM or other LLM inference engines?
Step 2) Open inferflow.sln with Visual Studio 2022
Step 3) Visual Studio asks to retarget, accept the defaults and click OK
Step 4) Switch to release configuration
Step 6) Notice the errors; the build fails
Step 7) Notice the build folder is created and some items get built, but most of it has failed
The projects won't load. If I try to reload them,
I get the following error:
The imported project "C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 12.2.props" was not found. Confirm that the expression in the Import declaration "C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\\BuildCustomizations\CUDA 12.2.props" is correct, and that the file exists on disk. C:\Users\michael\Documents\GitHub\inferflow\build\vs_projects\inferflow\inferflow_common.vcxproj
If I navigate to that folder, the file it's looking for does not exist.
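The error message shows that the project imports "CUDA 12.2.props". A quick way to see which CUDA build customizations are actually present is to list that folder; the Python sketch below does so. The folder path is taken from the error message above, `find_cuda_props` is a hypothetical helper, and the idea that a different CUDA version (or none) is integrated with Visual Studio is only a likely explanation, not a confirmed one.

```python
from pathlib import Path


def find_cuda_props(build_customizations: Path) -> list[str]:
    """Hypothetical helper: names of the CUDA *.props files in a folder."""
    if not build_customizations.is_dir():
        return []
    return sorted(p.name for p in build_customizations.glob("CUDA *.props"))


# Folder taken from the error message in this issue.
folder = Path(
    r"C:\Program Files\Microsoft Visual Studio\2022\Community"
    r"\MSBuild\Microsoft\VC\v170\BuildCustomizations"
)
found = find_cuda_props(folder)
if "CUDA 12.2.props" in found:
    print("CUDA 12.2 build customization is present")
else:
    print("CUDA 12.2.props not found; CUDA props present:", found or "none")
```

If the list is empty, the CUDA Visual Studio integration is probably not installed for this Visual Studio instance. If it shows another version, the version the projects expect and the installed toolkit differ.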
When will OrionStar LLMs such as OrionStarAI/Orion-14B-LongChat be supported?
Hybrid inference looks like a game changer. I want to test it on Qwen, but how do I make the model_spec.json for Qwen 1.5?
Thanks.
As the Whisper model has an encoder-decoder structure, do you have any example of using it with Inferflow?