
InternVL3_5-1B_GPTQ_INT4

This version of InternVL3_5-1B_GPTQ_INT4 has been converted to run on the Axera NPU using w4a16 quantization.

This model has been optimized with the following LoRA:

Compatible with Pulsar2 version: 5.1-patch1.

Note that the model's context length is 2k tokens and the maximum prefill length is 1k tokens.
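
To make the two limits concrete, here is a minimal sketch of how a prompt might be split into prefill chunks. It assumes the 128-token chunk size (`prefill_token_num : 128`) and the 1024-token cap (`prefill_max_token_num : 1024`) reported in the reference logs further down. The function name `split_prefill` is hypothetical and is not part of the Axera runtime.

```python
def split_prefill(num_tokens, chunk=128, max_prefill=1024):
    """Return the per-chunk token counts for a prompt of num_tokens tokens.

    chunk and max_prefill default to the values printed in the reference
    logs; this is an illustration, not the runtime's actual code.
    """
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    if num_tokens > max_prefill:
        raise ValueError(
            f"prompt of {num_tokens} tokens exceeds the {max_prefill}-token prefill limit"
        )
    full, rest = divmod(num_tokens, chunk)
    return [chunk] * full + ([rest] if rest else [])


print(split_prefill(284))  # [128, 128, 28]
print(split_prefill(21))   # [21]
```

The two calls mirror the token counts that appear in the C++ demo log (284 input tokens split three ways, 21 input tokens in a single pass).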

Conversion tool links:

If you are interested in model conversion, you can try exporting the axmodel from the original repository:

https://huggingface.co/OpenGVLab/InternVL3_5-1B

How to Convert LLM from Huggingface to axmodel

AXera NPU HOST LLM Runtime

AXera NPU AXCL LLM Runtime

Supported Platforms

Chips  | image encoder 448 | ttft       | w4a16
AX650  | 364.412 ms        | 883.458 ms | 28.09 tokens/sec
AX620E | 2358.956 ms       | 3136.54 ms | 7.33 tokens/sec
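
The figures above can be combined into a rough response-time estimate. The sketch below assumes that image encoding, prefill (ttft), and decoding happen one after another, that the ttft column does not already include image encoding, and that the AX620E ttft value is in milliseconds like the others. All three are assumptions, and `estimate_latency_s` is a hypothetical helper.

```python
def estimate_latency_s(image_encode_ms, ttft_ms, tokens_per_s, new_tokens):
    """Rough response time: encode the image, prefill, then decode new_tokens."""
    decode_s = new_tokens / tokens_per_s
    return (image_encode_ms + ttft_ms) / 1000.0 + decode_s


# Values copied from the table; 100 generated tokens is an arbitrary example.
ax650 = estimate_latency_s(364.412, 883.458, 28.09, new_tokens=100)
ax620e = estimate_latency_s(2358.956, 3136.54, 7.33, new_tokens=100)
print(f"AX650:  {ax650:.2f} s")
print(f"AX620E: {ax620e:.2f} s")
```

Real latency also depends on prompt length, which the table does not state, so treat the result as an order-of-magnitude guide only.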

How to use

Download all files from this repository to the device

$ tree -L 1
.
├── assets
├── config.json
├── examples
├── gradio_demo.py
├── infer_axmodel.py
├── infer_torch.py
├── internvl3-5_axmodel
├── internvl3-5_tokenizer
├── README.md
├── utils
└── vit-models

6 directories, 5 files
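
After downloading, a quick check that nothing is missing can save a failed run later. The following is a minimal sketch that only looks for the top-level names shown in the listing above; `missing_entries` is a hypothetical helper, not a script shipped in the repository.

```python
from pathlib import Path

# Names taken from the `tree -L 1` listing above.
EXPECTED_DIRS = ["assets", "examples", "internvl3-5_axmodel",
                 "internvl3-5_tokenizer", "utils", "vit-models"]
EXPECTED_FILES = ["config.json", "gradio_demo.py", "infer_axmodel.py",
                  "infer_torch.py", "README.md"]


def missing_entries(root="."):
    """Return the expected top-level entries that are absent under root."""
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    print(missing_entries() or "all expected entries present")
```

Run it from the repository root on the device; an empty result means every listed directory and file was found.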

Install transformers

pip install transformers==4.57.1
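
To confirm that the pinned version is the one actually installed, a small check with the standard library's `importlib.metadata` is enough. The helper name `check_pin` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pin(package, expected):
    """Return 'ok', 'mismatch', or 'missing' for an installed package version."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return "missing"
    return "ok" if installed == expected else "mismatch"


print("transformers:", check_pin("transformers", "4.57.1"))
```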

Inference with AX650 Host, such as M4N-Dock (AXera-Pi Pro) or AX650 DEMO Board

Interactive conversations using the C++ Gradio Demo (Updated: 2026.01.26):

Start the backend service:

./run_internvl_3-5_1b_448_ax650_api.sh

Reference log output:

root@ax650 ~/yongqiang/push_hugging_face/InternVL3_5-1B_GPTQ_INT4 # ./run_internvl_3-5_1b_448_ax650_api.sh
[I][                            Init][ 135]: LLM init start
[I][                            Init][ 137]: Total CMM:7915 MB
tokenizer_type = 3
  3% | ██                                |   1 /  31 [0.75s<23.19s, 1.34 count/s] tokenizer init ok
[I][                            Init][  26]: LLaMaEmbedSelector use mmap
  6% | ███                               |   2 /  31 [0.75s<11.69s, 2.65 count/s] embed_selector init ok
[I][                            Init][ 182]: attr.axmodel_num:28
 41% | ██████████████                    |  13 /  31 [3.42s<8.15s, 3.80 count/s] init 10 axmodel ok,remain_cmm(7596 MB)
 45% | ███████████████                   |  14 /  31 [3.68s<8.15s, 3.80 count/s] init 11 axmodel ok,remain_cmm(7567 MB)
 48% | ████████████████                  |  15 /  31 [3.95s<8.16s, 3.80 count/s] init 12 axmodel ok,remain_cmm(7538 MB)
 51% | █████████████████                 |  16 /  31 [4.19s<8.13s, 3.81 count/s] init 13 axmodel ok,remain_cmm(7509 MB)
 54% | ██████████████████                |  17 /  31 [4.45s<8.12s, 3.82 count/s] init 14 axmodel ok,remain_cmm(7480 MB)
 58% | ███████████████████               |  18 /  31 [4.70s<8.10s, 3.83 count/s] init 15 axmodel ok,remain_cmm(7451 MB)
 61% | ████████████████████              |  19 /  31 [5.05s<8.25s, 3.76 count/s] init 16 axmodel ok,remain_cmm(7422 MB)
 64% | █████████████████████             |  20 /  31 [5.30s<8.22s, 3.77 count/s] init 17 axmodel ok,remain_cmm(7393 MB)
 67% | ██████████████████████            |  21 /  31 [5.56s<8.21s, 3.78 count/s] init 18 axmodel ok,remain_cmm(7364 MB)
 70% | ███████████████████████           |  22 /  31 [5.81s<8.19s, 3.79 count/s] init 19 axmodel ok,remain_cmm(7335 MB)
 74% | ████████████████████████          |  23 /  31 [6.06s<8.17s, 3.79 count/s] init 20 axmodel ok,remain_cmm(7306 MB)
 77% | █████████████████████████         |  24 /  31 [6.32s<8.16s, 3.80 count/s] init 21 axmodel ok,remain_cmm(7277 MB)
 80% | ██████████████████████████        |  25 /  31 [6.59s<8.17s, 3.79 count/s] init 22 axmodel ok,remain_cmm(7248 MB)
 83% | ███████████████████████████       |  26 /  31 [6.86s<8.18s, 3.79 count/s] init 23 axmodel ok,remain_cmm(7219 MB)
 87% | ████████████████████████████      |  27 /  31 [7.13s<8.18s, 3.79 count/s] init 24 axmodel ok,remain_cmm(7190 MB)
 90% | █████████████████████████████     |  28 /  31 [7.39s<8.18s, 3.79 count/s] init 25 axmodel ok,remain_cmm(7161 MB)
 93% | ██████████████████████████████    |  29 /  31 [7.67s<8.19s, 3.78 count/s] init 26 axmodel ok,remain_cmm(7132 MB)
 96% | ███████████████████████████████   |  30 /  31 [7.93s<8.19s, 3.78 count/s] init 27 axmodel ok,remain_cmm(7103 MB)
100% | █████████████████████████████████ |  31 /  31 [9.86s<9.86s, 3.14 count/s] init post axmodel ok,remain_cmm(6940 MB)
[I][                            Init][ 240]: image encoder feature outputs:0
103% | ██████████████████████████████████ |  32 /  31 [13.95s<13.52s, 2.29 count/s] init vpm axmodel ok,remain_cmm(6588 MB)
[I][                            Init][ 280]: image encoder input nhwc@uint8
[I][                            Init][ 305]: image encoder output float32

[I][                            Init][ 335]: max_token_len : 2047
[I][                            Init][ 340]: kv_cache_size : 1024, kv_cache_num: 2047
[I][                            Init][ 348]: prefill_token_num : 128
[I][                            Init][ 352]: grp: 1, prefill_max_token_num : 1
[I][                            Init][ 352]: grp: 2, prefill_max_token_num : 128
[I][                            Init][ 352]: grp: 3, prefill_max_token_num : 256
[I][                            Init][ 352]: grp: 4, prefill_max_token_num : 384
[I][                            Init][ 352]: grp: 5, prefill_max_token_num : 512
[I][                            Init][ 352]: grp: 6, prefill_max_token_num : 640
[I][                            Init][ 352]: grp: 7, prefill_max_token_num : 768
[I][                            Init][ 352]: grp: 8, prefill_max_token_num : 896
[I][                            Init][ 352]: grp: 9, prefill_max_token_num : 1024
[I][                            Init][ 356]: prefill_max_token_num : 1024
[I][                     load_config][ 281]: load config:
{
    "enable_repetition_penalty": true,
    "enable_temperature": true,
    "enable_top_k_sampling": true,
    "enable_top_p_sampling": false,
    "penalty_window": 30,
    "repetition_penalty": 1.2,
    "temperature": 0.7,
    "top_k": 10,
    "top_p": 0.9
}

[I][                            Init][ 373]: LLM init ok
[I][                            Init][ 375]: Left CMM:6588 MB
Server running on port 8000...
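
Before launching the frontend, it can help to confirm that the backend is actually listening. The sketch below only tests whether a TCP connection to port 8000 succeeds; it assumes the frontend runs on the same device (host `127.0.0.1`) and does not call any API endpoint, since the log does not document one. `backend_is_up` is a hypothetical helper.

```python
import socket


def backend_is_up(host="127.0.0.1", port=8000, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


print("backend reachable:", backend_is_up())
```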

Run the Gradio frontend:

python3 gradio_demo_cpp_backend.py

Interactive conversations using the C++ Demo:

./run_internvl_3-5_1b_448_ax650.sh

The log information is as follows:

root@ax650 ~/yongqiang/push_hugging_face/InternVL3_5-1B_GPTQ_INT4 # ./run_internvl_3-5_1b_448_ax650.sh
[I][                            Init][ 135]: LLM init start
[I][                            Init][ 137]: Total CMM:7915 MB
tokenizer_type = 3
  3% | ██                                |   1 /  31 [0.71s<21.92s, 1.41 count/s] tokenizer init ok
[I][                            Init][  26]: LLaMaEmbedSelector use mmap
  6% | ███                               |   2 /  31 [0.71s<11.05s, 2.81 count/s] embed_selector init ok
[I][                            Init][ 182]: attr.axmodel_num:28
100% | █████████████████████████████████ |  31 /  31 [2.06s<2.06s, 15.03 count/s] init post axmodel ok,remain_cmm(6940 MB)
[I][                            Init][ 240]: image encoder feature outputs:0
103% | ██████████████████████████████████ |  32 /  31 [2.32s<2.25s, 13.79 count/s] init vpm axmodel ok,remain_cmm(6588 MB)
[I][                            Init][ 280]: image encoder input nhwc@uint8
[I][                            Init][ 305]: image encoder output float32

[I][                            Init][ 335]: max_token_len : 2047
[I][                            Init][ 340]: kv_cache_size : 1024, kv_cache_num: 2047
[I][                            Init][ 348]: prefill_token_num : 128
[I][                            Init][ 352]: grp: 1, prefill_max_token_num : 1
[I][                            Init][ 352]: grp: 2, prefill_max_token_num : 128
[I][                            Init][ 352]: grp: 3, prefill_max_token_num : 256
[I][                            Init][ 352]: grp: 4, prefill_max_token_num : 384
[I][                            Init][ 352]: grp: 5, prefill_max_token_num : 512
[I][                            Init][ 352]: grp: 6, prefill_max_token_num : 640
[I][                            Init][ 352]: grp: 7, prefill_max_token_num : 768
[I][                            Init][ 352]: grp: 8, prefill_max_token_num : 896
[I][                            Init][ 352]: grp: 9, prefill_max_token_num : 1024
[I][                            Init][ 356]: prefill_max_token_num : 1024
[I][                     load_config][ 281]: load config:
{
    "enable_repetition_penalty": true,
    "enable_temperature": true,
    "enable_top_k_sampling": true,
    "enable_top_p_sampling": false,
    "penalty_window": 30,
    "repetition_penalty": 1.2,
    "temperature": 0.7,
    "top_k": 10,
    "top_p": 0.9
}

[I][                            Init][ 373]: LLM init ok
[I][                            Init][ 375]: Left CMM:6588 MB
Type "q" to exit, Ctrl+c to stop current running
prompt(enter q to exit) >> Introduce yourself
image(press Enter to skip) >>
[I][                             Run][ 713]: input token num : 21, prefill_split_num : 1
[I][                             Run][ 747]: input_num_token:21
[I][                             Run][ 976]: ttft: 83.79 ms
I am called "Language Model-1.0" and come from Shanghai AI Laboratory. My development team is dedicated to providing users with efficient, accurate, and personalized AI services. As an advanced natural language processing (NLP) model, I aim to help users solve a variety of language-related problems and to provide useful information and suggestions. My design goal is to interact with humans in a natural, fluent way, whether answering questions, offering suggestions, or carrying out tasks.

[N][                             Run][1102]: hit eos,avg 19.79 token/s

prompt(enter q to exit) >> Please describe the image below in detail
image(press Enter to skip) >> assets/image_1.jpg
[I][                     EncodeImage][ 481]: image encode time : 408.467987 ms, size : 1
[I][                          Encode][ 636]: input_ids size:284
[I][                          Encode][ 644]: offset 15
[I][                          Encode][ 673]: img_embed.size:1, 262144
[I][                          Encode][ 689]: out_embed size:290816
[I][                          Encode][ 690]: input_ids size 284
[I][                          Encode][ 692]: position_ids size:284
[I][                             Run][ 713]: input token num : 284, prefill_split_num : 3
[I][                             Run][ 747]: input_num_token:128
[I][                             Run][ 747]: input_num_token:128
[I][                             Run][ 747]: input_num_token:28
[I][                             Run][ 976]: ttft: 270.76 ms
This is a vivid picture showing a giant panda foraging in a natural setting. In the frame, the panda has its head lowered, searching for food among the plants. Its fur is white, with black patches on its back and belly. The surroundings are lush and green, with shrubs and plants all around it, full of life. The wooden structure in the background may be a bamboo pole or a bench, further suggesting that this could be a zoo or a wildlife reserve. The whole scene is filled with a natural atmosphere and conveys the charm and vitality of nature.

[N][                             Run][1102]: hit eos,avg 19.86 token/s

prompt(enter q to exit) >>
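
The `load config` block in the logs above enables a repetition penalty, temperature scaling, and top-k sampling, with top-p disabled. As a rough illustration of how such settings are commonly combined, here is a pure-Python sketch. It is not the Axera runtime's implementation: the order of the steps, the penalty rule, and the function name `sample_next` are all assumptions.

```python
import math
import random


def sample_next(logits, history, temperature=0.7, top_k=10,
                repetition_penalty=1.2, penalty_window=30, rng=random):
    """Pick a token id from raw logits using the settings shown in the log."""
    logits = list(logits)
    # Penalise tokens seen in the last `penalty_window` generated tokens.
    for tok in set(history[-penalty_window:]):
        if logits[tok] > 0:
            logits[tok] /= repetition_penalty
        else:
            logits[tok] *= repetition_penalty
    # Keep only the top_k highest-scoring candidates.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Temperature-scaled softmax weights over the survivors (shifted for stability).
    scaled = [logits[i] / temperature for i in top]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so repeated runs give the same draw
print(sample_next([2.0, 1.0, 0.5, -1.0], history=[0], rng=rng))
```

With `top_k=1` the function reduces to greedy decoding after the penalty is applied, which is a convenient way to test it.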

Interactive conversations using the Gradio Python API:

$ python3 gradio_demo.py --hf_model internvl3-5_tokenizer/ --axmodel_path internvl3-5_axmodel/ --vit_model vit-models/internvl_vit_model_1x3x448x448.axmodel

Plain text dialogue:

demo_1

Image understanding:

demo_2


Run the following command on the Axera board to start a chat conversation:

$ python3 infer_axmodel.py --hf_model internvl3-5_tokenizer/ --axmodel_path internvl3-5_axmodel/ --question "Please compute the derivative of the function [y=2x^2+2] and provide the reasoning process in markdown format"

output:

[INFO] Using provider: AxEngineExecutionProvider
[INFO] Model type: 2 (triple core)
[INFO] Compiler version: 5.1-dirty 0fdbfe15-dirty
Model loaded successfully!
slice_indices: [0]
Slice prefill done: 0
answer >> The derivative of the function \( y = 2x^2 + 2 \) can be computed using differentiation rules. First, we differentiate each term of the function separately:

1. For \( 2x^2 \), apply the power rule:
   \[
   \frac{d}{dx}(2x^2) = 2 \cdot 2x = 4x
   \]

2. For the constant term \( 2 \), the derivative is 0, because the derivative of a constant is 0.

Adding the two results gives the derivative of the function \( y \):
\[
y' = 4x
\]

Therefore, the derivative of the function \( y = 2x^2 + 2 \) is \( y' = 4x \).
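
The model's answer above can be sanity-checked numerically. The sketch below compares a central finite difference of y = 2x^2 + 2 with the claimed derivative 4x at a few sample points; the helper `central_diff` and the chosen points are illustrative only.

```python
def f(x):
    return 2 * x**2 + 2


def central_diff(func, x, h=1e-5):
    """Approximate func'(x) with a symmetric finite difference."""
    return (func(x + h) - func(x - h)) / (2 * h)


for x in (-2.0, 0.0, 1.5, 3.0):
    approx = central_diff(f, x)
    # The claimed derivative is 4x; allow a small tolerance for rounding.
    assert abs(approx - 4 * x) < 1e-6, (x, approx)
print("finite differences agree with y' = 4x")
```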

Enter the following command to perform the single-image understanding task:

$ python3 infer_axmodel.py --hf_model internvl3-5_tokenizer/ --axmodel_path internvl3-5_axmodel/ --question "Please describe this image" -i examples/image_0.jpg --vit_model vit-models/internvl_vit_model_1x3x448x448.axmodel

image_0.jpg

output:

[INFO] Model type: 2 (triple core)
[INFO] Compiler version: 5.1-dirty 0fdbfe15-dirty
Model loaded successfully!
slice_indices: [0, 1, 2]
Slice prefill done: 0
Slice prefill done: 1
Slice prefill done: 2
answer >> This is a photo of a red panda. The red panda is a reddish-brown mammal that usually lives in the forests of Asia. They feed on insects and small invertebrates. In the picture, the red panda is sitting on a wooden platform, with green trees and vegetation in the background, which looks very natural and vivid. The red panda's expression looks friendly, as if it is observing or waiting for something.