Upload folder using huggingface_hub

Files changed (16)

README.md +192 -6
adapters/nova-embeddings-v1-adapter-code/adapter_config.json +3 -32
adapters/nova-embeddings-v1-adapter-retrieval/adapter_config.json +3 -32
adapters/nova-embeddings-v1-adapter-text-matching/adapter_config.json +3 -32
added_tokens.json +3 -24
chat_template.json +3 -3
config.json +3 -108
config_sentence_transformers.json +3 -13
generation_config.json +3 -6
model.safetensors.index.json +3 -833
modules.json +3 -9
preprocessor_config.json +3 -33
results.json +3 -582
special_tokens_map.json +3 -31
tokenizer_config.json +3 -209
vocab.json +0 -0

README.md CHANGED

@@ -1,3 +1,107 @@
 # Nova Embeddings V1
 > 🚀 **Industry First: Multimodal Multi-Vector Embeddings with Runtime Instruction Tuning**
@@ -752,7 +856,7 @@ remodlai/nova-embeddings-v1/
 ├── adapters/
 │   ├── retrieval/
 │   │   ├── adapter_config.json         # r=32, target_modules=[output_proj]
-│   │   └── adapter_model.safetensors   # ~4MB projector-only LoRA
 │   ├── text-matching/
 │   └── code/
 ├── configuration_nova_embeddings_v1.py  # NovaEmbeddingsV1Config
@@ -765,9 +869,9 @@ remodlai/nova-embeddings-v1/
 Nova adapters modify **only** the vision-language projector (the MLP that projects vision encoder outputs into the language model's embedding space). This design:
 1. **Preserves pretrained quality**: Vision encoder (SigLIP) and LLM (Qwen2.5-VL) remain frozen, maintaining Jina's training investment
-2. **Minimizes adapter size**: Each adapter is ~4MB vs ~500MB+ for full model fine-tuning
 3. **Enables fast switching**: Nova can swap adapters with <10ms overhead during inference
-4. **Reduces memory pressure**: Base model (3B params) loaded once; adapters add <0.1% memory overhead
 **Adapter Configuration:**
 ```json
@@ -867,8 +971,8 @@ Adapters can be overridden per-item via the `adapter` field for A/B testing or c
 | Mode | Base Model | Per Adapter | Total (3 adapters) |
 |------|-----------|-------------|-------------------|
-| FP16 | ~6.5GB | ~4MB | ~6.6GB |
-| BF16 | ~6.5GB | ~4MB | ~6.6GB |
 **Multi-vector mode** adds ~2GB for KV cache depending on batch size and sequence lengths.
@@ -1102,6 +1206,78 @@ This model inherits licensing from its base components:
 ---
 ## Citation
 If you use Nova Embeddings V1 in research, please cite both the Nova packaging and upstream Jina V4:
@@ -1130,4 +1306,14 @@ If you use Nova Embeddings V1 in research, please cite both the Nova packaging a
 - **Issues**: [GitHub Issues](https://github.com/remodlai/nova-embeddings-v1/issues)
 - **Documentation**: [Nova Docs](https://docs.nova.ai)
-- **Enterprise Support**: Contact your account representative

+---
+language:
+- multilingual
+- en
+- zh
+- ja
+- ko
+- ar
+- de
+- es
+- fr
+- hi
+- it
+- pt
+- ru
+license: other
+license_name: qwen-research-license
+license_link: https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct
+library_name: transformers
+pipeline_tag: feature-extraction
+tags:
+- embeddings
+- multimodal
+- vision
+- code
+- multilingual
+- instruction-tuning
+- retrieval
+- text-matching
+- sentence-similarity
+- late-interaction
+- multi-vector
+- mteb
+- vidore
+- lora
+- adapter
+- nova
+- runtime-instructions
+- feature-extraction
+base_model:
+- Qwen/Qwen2.5-VL-3B-Instruct
+- jinaai/jina-embeddings-v4
+metrics:
+- precision
+- recall
+- ndcg
+- mrr
+model-index:
+- name: nova-embeddings-v1
+  results:
+  - task:
+      type: retrieval
+      name: Legal Document Retrieval
+    dataset:
+      name: US Case Law Corpus
+      type: legal-retrieval
+    metrics:
+    - type: precision@10
+      value: 79.1
+      name: P@10 (with instructions)
+    - type: precision@10
+      value: 62.3
+      name: P@10 (baseline)
+  - task:
+      type: retrieval
+      name: Medical Literature Search
+    dataset:
+      name: PubMed Abstracts
+      type: medical-retrieval
+    metrics:
+    - type: ndcg@20
+      value: 0.843
+      name: NDCG@20 (with instructions)
+    - type: ndcg@20
+      value: 0.701
+      name: NDCG@20 (baseline)
+  - task:
+      type: retrieval
+      name: Financial Compliance
+    dataset:
+      name: SEC Filings
+      type: financial-retrieval
+    metrics:
+    - type: mrr
+      value: 0.712
+      name: MRR (with instructions)
+    - type: mrr
+      value: 0.554
+      name: MRR (baseline)
+  - task:
+      type: code-retrieval
+      name: Code Search
+    dataset:
+      name: GitHub Functions
+      type: code-search
+    metrics:
+    - type: exact_match@5
+      value: 53.8
+      name: EM@5 (with instructions)
+    - type: exact_match@5
+      value: 41.2
+      name: EM@5 (baseline)
+---
 # Nova Embeddings V1
 > 🚀 **Industry First: Multimodal Multi-Vector Embeddings with Runtime Instruction Tuning**
 ├── adapters/
 │   ├── retrieval/
 │   │   ├── adapter_config.json         # r=32, target_modules=[output_proj]
+│   │   └── adapter_model.safetensors   # ~121MB projector-only LoRA
 │   ├── text-matching/
 │   └── code/
 ├── configuration_nova_embeddings_v1.py  # NovaEmbeddingsV1Config
 Nova adapters modify **only** the vision-language projector (the MLP that projects vision encoder outputs into the language model's embedding space). This design:
 1. **Preserves pretrained quality**: Vision encoder (SigLIP) and LLM (Qwen2.5-VL) remain frozen, maintaining Jina's training investment
+2. **Minimizes adapter size**: Each adapter is ~121MB vs ~500MB+ for full model fine-tuning
 3. **Enables fast switching**: Nova can swap adapters with <10ms overhead during inference
+4. **Reduces memory pressure**: Base model (3B params) loaded once; each adapter adds ~2% memory overhead (~121MB against a ~6.5GB base)
 **Adapter Configuration:**
 ```json
 | Mode | Base Model | Per Adapter | Total (3 adapters) |
 |------|-----------|-------------|-------------------|
+| FP16 | ~6.5GB | ~121MB | ~6.9GB |
+| BF16 | ~6.5GB | ~121MB | ~6.9GB |
 **Multi-vector mode** adds ~2GB for KV cache depending on batch size and sequence lengths.
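
The Total column can be reproduced from the other two columns. A minimal sketch, using only the table's own figures and assuming decimal units (1GB = 1000MB); the function name is illustrative and not part of the repository:

```python
# Figures copied from the memory table above; decimal units are an assumption.
BASE_MODEL_GB = 6.5
ADAPTER_MB = 121
NUM_ADAPTERS = 3


def total_memory_gb(base_gb, adapter_mb, num_adapters):
    """Base model plus all resident adapters, expressed in GB."""
    return base_gb + num_adapters * adapter_mb / 1000


total = total_memory_gb(BASE_MODEL_GB, ADAPTER_MB, NUM_ADAPTERS)
print(f"~{total:.1f}GB with {NUM_ADAPTERS} adapters loaded")
```

The KV cache cost for multi-vector mode is not included here, since it depends on batch size and sequence length.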
 ---
+## Model Details
+### Model Description
+Nova Embeddings V1 is a production-optimized multimodal embedding model that extends Jina Embeddings V4 with runtime instruction tuning capabilities. It combines vision, text, and code understanding with dynamic domain adaptation through per-request instructions.
+- **Developed by:** Remodl AI
+- **Model type:** Multimodal Embedding Model
+- **Base Model:** Jina Embeddings V4 (built on Qwen2.5-VL-3B-Instruct)
+- **Language(s):** Multilingual (30+ languages including English, Chinese, Japanese, Korean, Arabic, German, Spanish, French, Hindi, Italian, Portuguese, Russian)
+- **License:** Qwen Research License (inherited from base model)
+- **Finetuned from:** jinaai/jina-embeddings-v4
+### Model Architecture
+- **Architecture:** Vision-Language Transformer with projector-only LoRA adapters
+- **Vision Encoder:** SigLIP (frozen)
+- **Language Model:** Qwen2.5-VL-3B (frozen)
+- **Adapters:** Projector-only LoRA (r=32) for retrieval, text-matching, and code tasks
+- **Parameters:** ~3B in the base model; each adapter adds ~121MB of weights
+- **Embedding Dimensions:**
+  - Single-vector: 2048 (matryoshka-truncatable to 128/256/512/1024)
+  - Multi-vector: 128 per token
+- **Max Sequence Length:** 32,768 tokens
+- **Vision Input:** 729 patches (27×27 grid) per image
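
The matryoshka truncation listed under Embedding Dimensions can be sketched as follows. This is a hedged illustration, not the model's own code: it assumes truncation means keeping the leading components and then L2-normalizing again, which is the usual practice for matryoshka embeddings. `truncate_embedding` is a hypothetical helper name.

```python
import math

# Dimensions listed in the model card (also "matryoshka_dims" in config.json).
MATRYOSHKA_DIMS = (128, 256, 512, 1024, 2048)


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and L2-normalize the result."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"dim must be one of {MATRYOSHKA_DIMS}, got {dim}")
    if len(vec) < dim:
        raise ValueError("vector is shorter than the requested dimension")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to normalize
    return [x / norm for x in head]


# Stand-in for a real 2048-dimensional single-vector embedding.
full = [1.0] * 2048
small = truncate_embedding(full, 128)
print(len(small))
```

Smaller dimensions trade some retrieval quality for index size, so it is worth measuring on your own data before committing to one.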
+### Training Data
+Nova Embeddings V1 uses the same training data as Jina Embeddings V4:
+- Multilingual text pairs from 30+ languages
+- Multimodal (text+image) pairs for visual document understanding
+- Code-related pairs for programming language understanding
+- Task-specific adapters trained with contrastive learning
+For detailed training data composition, see the [Jina V4 technical report](https://arxiv.org/abs/2506.18902).
+### Intended Use
+**Primary Use Cases:**
+- Domain-specific document retrieval (legal, medical, financial)
+- Visual document understanding (charts, tables, technical diagrams)
+- Code search and semantic similarity
+- Multilingual information retrieval
+- Multi-tenant SaaS applications requiring per-customer domain tuning
+**Out-of-Scope Use:**
+- Real-time video processing (static frames only)
+- Tasks requiring generation (use a generative model instead)
+- Audio/speech processing (text and vision only)
+### Limitations
+- **License restrictions:** Non-commercial use only (see Qwen Research License)
+- **Instruction quality:** Generic instructions provide minimal improvement; domain expertise required
+- **Vision limitations:** Best for documents/charts, less optimized for natural scenes
+- **Latency:** Multimodal requests are 3-10x slower than text-only
+- **Context window:** Supports up to 32k tokens, but performs best below 8k
+### Bias and Fairness
+Nova inherits biases from:
+1. Jina V4's training data
+2. Qwen2.5-VL's pretraining corpus
+3. User-provided instructions (can amplify or introduce new biases)
+**Recommendations:**
+- Evaluate on your specific domain before production deployment
+- Monitor instruction quality and audit for bias-inducing language
+- Test across demographic groups if used for sensitive applications
+---
 ## Citation
 If you use Nova Embeddings V1 in research, please cite both the Nova packaging and upstream Jina V4:
 - **Issues**: [GitHub Issues](https://github.com/remodlai/nova-embeddings-v1/issues)
 - **Documentation**: [Nova Docs](https://docs.nova.ai)
+- **Enterprise Support**: Contact your account representative
+---
+## Model Card Authors
+Remodl AI Team
+## Model Card Contact
+For questions about this model card, contact: [email protected]

adapters/nova-embeddings-v1-adapter-code/adapter_config.json CHANGED

@@ -1,32 +1,3 @@
-{
-  "alpha_pattern": {},
-  "auto_mapping": null,
-  "base_model_name_or_path": "jinaai/jina-embeddings-v4",
-  "bias": "none",
-  "corda_config": null,
-  "eva_config": null,
-  "exclude_modules": ".*visual.*",
-  "fan_in_fan_out": false,
-  "inference_mode": true,
-  "init_lora_weights": "gaussian",
-  "layer_replication": null,
-  "layers_pattern": null,
-  "layers_to_transform": null,
-  "loftq_config": {},
-  "lora_alpha": 32,
-  "lora_bias": false,
-  "lora_dropout": 0.1,
-  "megatron_config": null,
-  "megatron_core": "megatron.core",
-  "modules_to_save": null,
-  "peft_type": "LORA",
-  "r": 32,
-  "rank_pattern": {},
-  "revision": null,
-  "target_modules": "(.*(model).*(down_proj|gate_proj|up_proj|k_proj|q_proj|v_proj|o_proj).*$|.*(single_vector_projector|multi_vector_projector).*$)",
-  "task_type": "FEATURE_EXTRACTION",
-  "trainable_token_indices": null,
-  "use_dora": false,
-  "use_rslora": false,
-  "task_name": "code"
-}

+version https://git-lfs.github.com/spec/v1
+oid sha256:aa5341ef2bd616d91d83f981e965ada31c14dfc36d53ddf19e3e4f991457493e
+size 923
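
The three added lines are a Git LFS pointer file: the JSON itself now lives in LFS storage, and the repository tracks only its hash and size. A small sketch of reading such a pointer, assuming the standard `key value` line layout (`parse_lfs_pointer` is an illustrative helper, not a library function):

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its version, hash and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:aa5341ef2bd616d91d83f981e965ada31c14dfc36d53ddf19e3e4f991457493e
size 923
"""
pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_algo"], pointer["size"])
```

To read the actual configuration, the LFS object has to be fetched (for example with `git lfs pull`); the pointer alone does not contain the JSON.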

adapters/nova-embeddings-v1-adapter-retrieval/adapter_config.json CHANGED

@@ -1,32 +1,3 @@
-{
-  "alpha_pattern": {},
-  "auto_mapping": null,
-  "base_model_name_or_path": "jinaai/jina-embeddings-v4",
-  "bias": "none",
-  "corda_config": null,
-  "eva_config": null,
-  "exclude_modules": ".*visual.*",
-  "fan_in_fan_out": false,
-  "inference_mode": true,
-  "init_lora_weights": "gaussian",
-  "layer_replication": null,
-  "layers_pattern": null,
-  "layers_to_transform": null,
-  "loftq_config": {},
-  "lora_alpha": 32,
-  "lora_bias": false,
-  "lora_dropout": 0.1,
-  "megatron_config": null,
-  "megatron_core": "megatron.core",
-  "modules_to_save": null,
-  "peft_type": "LORA",
-  "r": 32,
-  "rank_pattern": {},
-  "revision": null,
-  "target_modules": "(.*(model).*(down_proj|gate_proj|up_proj|k_proj|q_proj|v_proj|o_proj).*$|.*(single_vector_projector|multi_vector_projector).*$)",
-  "task_type": "FEATURE_EXTRACTION",
-  "trainable_token_indices": null,
-  "use_dora": false,
-  "use_rslora": false,
-  "task_name": "retrieval"
-}

+version https://git-lfs.github.com/spec/v1
+oid sha256:1b2bf6bf063bb56d987823d180bb75bc10f02d71384a55c9f9e8d4477a8404c5
+size 928
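
To see which modules the removed `target_modules` and `exclude_modules` patterns select, the two regular expressions can be tried against candidate module names. This sketch assumes string patterns are applied as a full match against the module name, with exclusion checked first. The names in `candidates` are examples: the `model.layers.*` ones follow the weight map in `model.safetensors.index.json`, while the `visual` one is hypothetical.

```python
import re

# Patterns copied verbatim from the removed adapter_config.json.
TARGET_MODULES = (
    r"(.*(model).*(down_proj|gate_proj|up_proj|k_proj|q_proj|v_proj|o_proj).*$"
    r"|.*(single_vector_projector|multi_vector_projector).*$)"
)
EXCLUDE_MODULES = r".*visual.*"


def is_adapted(module_name):
    """True if the name matches the target pattern and is not excluded."""
    if re.fullmatch(EXCLUDE_MODULES, module_name):
        return False
    return re.fullmatch(TARGET_MODULES, module_name) is not None


candidates = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.mlp.down_proj",
    "model.layers.0.input_layernorm",
    "multi_vector_projector",
    "model.visual.blocks.0.mlp.down_proj",  # hypothetical vision-tower name
]
for name in candidates:
    print(f"{name}: {is_adapted(name)}")
```

Under these assumptions the pattern selects the attention and MLP projections of the language model as well as the two embedding projectors, and skips anything with `visual` in its name.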

adapters/nova-embeddings-v1-adapter-text-matching/adapter_config.json CHANGED

@@ -1,32 +1,3 @@
-{
-  "alpha_pattern": {},
-  "auto_mapping": null,
-  "base_model_name_or_path": "jinaai/jina-embeddings-v4",
-  "bias": "none",
-  "corda_config": null,
-  "eva_config": null,
-  "exclude_modules": ".*visual.*",
-  "fan_in_fan_out": false,
-  "inference_mode": true,
-  "init_lora_weights": "gaussian",
-  "layer_replication": null,
-  "layers_pattern": null,
-  "layers_to_transform": null,
-  "loftq_config": {},
-  "lora_alpha": 32,
-  "lora_bias": false,
-  "lora_dropout": 0.1,
-  "megatron_config": null,
-  "megatron_core": "megatron.core",
-  "modules_to_save": null,
-  "peft_type": "LORA",
-  "r": 32,
-  "rank_pattern": {},
-  "revision": null,
-  "target_modules": "(.*(model).*(down_proj|gate_proj|up_proj|k_proj|q_proj|v_proj|o_proj).*$|.*(single_vector_projector|multi_vector_projector).*$)",
-  "task_type": "FEATURE_EXTRACTION",
-  "trainable_token_indices": null,
-  "use_dora": false,
-  "use_rslora": false,
-  "task_name": "text-matching"
-}

+version https://git-lfs.github.com/spec/v1
+oid sha256:3534d3fc44715be78d07a3561ed303f1e6cab9f7e36f54283f47b01a2acb17bb
+size 932

added_tokens.json CHANGED

@@ -1,24 +1,3 @@
-{
-  "</tool_call>": 151658,
-  "<tool_call>": 151657,
-  "<|box_end|>": 151649,
-  "<|box_start|>": 151648,
-  "<|endoftext|>": 151643,
-  "<|file_sep|>": 151664,
-  "<|fim_middle|>": 151660,
-  "<|fim_pad|>": 151662,
-  "<|fim_prefix|>": 151659,
-  "<|fim_suffix|>": 151661,
-  "<|im_end|>": 151645,
-  "<|im_start|>": 151644,
-  "<|image_pad|>": 151655,
-  "<|object_ref_end|>": 151647,
-  "<|object_ref_start|>": 151646,
-  "<|quad_end|>": 151651,
-  "<|quad_start|>": 151650,
-  "<|repo_name|>": 151663,
-  "<|video_pad|>": 151656,
-  "<|vision_end|>": 151653,
-  "<|vision_pad|>": 151654,
-  "<|vision_start|>": 151652
-}

+version https://git-lfs.github.com/spec/v1
+oid sha256:58b54bbe36fc752f79a24a271ef66a0a0830054b4dfad94bde757d851968060b
+size 605

chat_template.json CHANGED

@@ -1,3 +1,3 @@
-{
-  "chat_template": "{% set image_count = namespace(value=0) %}{% set video_count = namespace(value=0) %}{% for message in messages %}{% if loop.first and message['role'] != 'system' %}<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n{% endif %}<|im_start|>{{ message['role'] }}\n{% if message['content'] is string %}{{ message['content'] }}<|im_end|>\n{% else %}{% for content in message['content'] %}{% if content['type'] == 'image' or 'image' in content or 'image_url' in content %}{% set image_count.value = image_count.value + 1 %}{% if add_vision_id %}Picture {{ image_count.value }}: {% endif %}<|vision_start|><|image_pad|><|vision_end|>{% elif content['type'] == 'video' or 'video' in content %}{% set video_count.value = video_count.value + 1 %}{% if add_vision_id %}Video {{ video_count.value }}: {% endif %}<|vision_start|><|video_pad|><|vision_end|>{% elif 'text' in content %}{{ content['text'] }}{% endif %}{% endfor %}<|im_end|>\n{% endif %}{% endfor %}{% if add_generation_prompt %}<|im_start|>assistant\n{% endif %}"
-}

+version https://git-lfs.github.com/spec/v1
+oid sha256:94174d7176c52a7192f96fc34eb2cf23c7c2059d63cdbfadca1586ba89731fb7
+size 1049

config.json CHANGED

@@ -1,108 +1,3 @@
-{
-  "_name_or_path": "remodlai/nova-embeddings-v1",
-  "architectures": [
-    "JinaEmbeddingsV4Model"
-  ],
-  "auto_map": {
-    "AutoConfig": "configuration_nova_embeddings_v1.NovaEmbeddingsV1Config",
-    "AutoModel": "modeling_nova_embeddings_v1.NovaEmbeddingsV1Model"
-  },
-  "attention_dropout": 0.0,
-  "bos_token_id": 151643,
-  "eos_token_id": 151645,
-  "hidden_act": "silu",
-  "hidden_size": 2048,
-  "image_token_id": 151655,
-  "initializer_range": 0.02,
-  "intermediate_size": 11008,
-  "max_position_embeddings": 128000,
-  "max_window_layers": 70,
-  "multi_vector_projector_dim": 128,
-  "num_attention_heads": 16,
-  "num_hidden_layers": 36,
-  "num_key_value_heads": 2,
-  "rms_norm_eps": 1e-06,
-  "rope_scaling": {
-    "mrope_section": [
-      16,
-      24,
-      24
-    ],
-    "rope_type": "default",
-    "type": "default"
-  },
-  "rope_theta": 1000000.0,
-  "single_vector_pool_strategy": "mean",
-  "sliding_window": 32768,
-  "tie_word_embeddings": true,
-  "text_config": {
-    "attention_dropout": 0.0,
-    "bos_token_id": 151643,
-    "eos_token_id": 151645,
-    "hidden_act": "silu",
-    "hidden_size": 2048,
-    "image_token_id": null,
-    "initializer_range": 0.02,
-    "intermediate_size": 11008,
-    "max_position_embeddings": 128000,
-    "max_window_layers": 70,
-    "model_type": "qwen2_5_vl_text",
-    "num_attention_heads": 16,
-    "num_hidden_layers": 36,
-    "num_key_value_heads": 2,
-    "rms_norm_eps": 1e-06,
-    "rope_scaling": {
-      "mrope_section": [
-        16,
-        24,
-        24
-      ],
-      "rope_type": "default",
-      "type": "default"
-    },
-    "rope_theta": 1000000.0,
-    "sliding_window": null,
-    "tie_word_embeddings": true,
-    "torch_dtype": "bfloat16",
-    "use_cache": true,
-    "use_sliding_window": false,
-    "vocab_size": 151936
-  },
-  "torch_dtype": "bfloat16",
-  "transformers_version": "4.52.0",
-  "use_cache": true,
-  "use_sliding_window": false,
-  "video_token_id": 151656,
-  "vision_config": {
-    "depth": 32,
-    "fullatt_block_indexes": [
-      7,
-      15,
-      23,
-      31
-    ],
-    "hidden_act": "silu",
-    "hidden_size": 1280,
-    "in_channels": 3,
-    "in_chans": 3,
-    "initializer_range": 0.02,
-    "intermediate_size": 3420,
-    "model_type": "qwen2_5_vl",
-    "num_heads": 16,
-    "out_hidden_size": 2048,
-    "patch_size": 14,
-    "spatial_merge_size": 2,
-    "spatial_patch_size": 14,
-    "temporal_patch_size": 2,
-    "tokens_per_second": 2,
-    "torch_dtype": "bfloat16",
-    "window_size": 112
-  },
-  "task_names": ["retrieval", "text-matching", "code"],
-  "matryoshka_dims": [128, 256, 512, 1024, 2048],
-  "_attn_implementation": "flash_attention_2",
-  "truncate_dim": null,
-  "vision_end_token_id": 151653,
-  "vision_start_token_id": 151652,
-  "vision_token_id": 151654
-}

+version https://git-lfs.github.com/spec/v1
+oid sha256:9217f486476a3e931f64aacfe63e17e6d90dbed2fea8d3046c15409f5d7a78c9
+size 2753
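
A few derived quantities follow directly from the removed `config.json` values. The sketch below copies only the relevant keys and applies the standard grouped-query attention formula for KV cache size. Whether and how long the serving stack keeps a KV cache for an embedding request is an assumption here, so treat the byte counts as an upper-bound estimate per sequence.

```python
# Values copied from the removed config.json shown above.
cfg = {
    "hidden_size": 2048,
    "num_attention_heads": 16,
    "num_key_value_heads": 2,
    "num_hidden_layers": 36,
    "mrope_section": [16, 24, 24],
}
BYTES_PER_VALUE = 2  # bfloat16, per "torch_dtype"

head_dim = cfg["hidden_size"] // cfg["num_attention_heads"]
queries_per_kv_head = cfg["num_attention_heads"] // cfg["num_key_value_heads"]


def kv_cache_bytes(num_tokens):
    """Keys plus values across all layers and KV heads for one sequence."""
    per_token = (
        2  # one key and one value tensor
        * cfg["num_hidden_layers"]
        * cfg["num_key_value_heads"]
        * head_dim
        * BYTES_PER_VALUE
    )
    return per_token * num_tokens


print("head_dim:", head_dim)
print("query heads per KV head:", queries_per_kv_head)
print("mrope sections cover half the head:", sum(cfg["mrope_section"]) == head_dim // 2)
print("KV cache at 32,768 tokens (GiB):", kv_cache_bytes(32768) / 2**30)
```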

config_sentence_transformers.json CHANGED

@@ -1,13 +1,3 @@
-{
-    "__version__": {
-      "sentence_transformers": "4.1.0",
-      "transformers": "4.50.0",
-      "pytorch": "2.6.0"
-    },
-    "prompts":{
-      "query":"Query: ",
-      "passage":"Passage: "
-    },
-    "default_prompt_name": null,
-    "similarity_fn_name": "cosine"
-  }

+version https://git-lfs.github.com/spec/v1
+oid sha256:1eee316c1ced66356d6472a0f3e2ff28084e8a693cbb2bb758ed98cc3f20ba22
+size 274
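
The removed settings declare two prompt prefixes and cosine similarity as the scoring function. A dependency-free sketch of what those two settings mean in practice; the helper names are illustrative and the vectors are toy stand-ins for real embeddings:

```python
import math

# Prefixes copied from the removed "prompts" block.
PROMPTS = {"query": "Query: ", "passage": "Passage: "}


def apply_prompt(kind, text):
    """Prepend the configured prefix before the text is embedded."""
    return PROMPTS[kind] + text


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


print(apply_prompt("query", "statute of limitations"))
print(cosine_similarity([1.0, 0.0], [1.0, 1.0]))
```

Because `default_prompt_name` is `null`, no prefix is applied unless the caller asks for one by name.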

generation_config.json CHANGED

@@ -1,6 +1,3 @@
-{
-  "_from_model_config": true,
-  "bos_token_id": 151643,
-  "eos_token_id": 151645,
-  "transformers_version": "4.50.0.dev0"
-}

+version https://git-lfs.github.com/spec/v1
+oid sha256:4be8e7b43a811255a415e39b11c9b78a3a267120e20dd198774b1a14dcc5ea86
+size 126

model.safetensors.index.json CHANGED

@@ -1,833 +1,3 @@
-{
-  "metadata": {
-    "total_size": 7513966848
-  },
-  "weight_map": {
-    "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
-    "model.layers.0.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.0.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.0.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.0.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.0.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.0.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.1.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.1.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.1.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.1.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.1.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.1.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.10.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.10.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.10.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.10.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.10.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.10.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.10.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.10.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.10.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.10.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.10.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.10.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.11.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.11.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.11.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.11.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.11.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.11.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.11.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.11.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.11.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.11.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.11.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.11.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.12.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.12.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.12.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.12.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.12.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.12.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.12.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.12.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.12.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.12.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.12.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.12.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.13.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.13.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.13.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.13.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.13.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.13.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.13.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.13.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.13.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.13.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.13.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.13.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.14.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.14.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.14.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.14.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.14.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.14.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.14.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.14.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.14.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.14.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.14.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.14.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.15.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.15.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.15.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.15.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.15.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.15.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.15.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.15.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.15.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.15.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.15.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.15.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.16.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.16.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.16.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.16.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.16.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.16.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.16.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.16.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.16.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.16.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.16.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.16.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.17.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.17.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.17.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.17.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.17.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.17.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.17.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.17.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.17.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.17.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.17.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.17.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.18.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.18.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.18.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.18.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.18.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.18.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.18.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.18.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.18.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.18.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.18.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.18.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.19.input_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.19.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.19.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.19.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.19.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.19.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.19.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.19.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.19.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.19.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.19.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.19.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.2.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.2.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.2.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.2.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.2.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.2.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.20.input_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.20.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.20.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.20.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.20.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.20.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.20.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.20.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.20.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.20.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.20.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.20.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.21.input_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.21.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.21.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.21.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.21.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.21.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.21.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.21.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.21.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.21.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.21.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.21.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.22.input_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.22.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.22.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.22.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.22.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.22.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.22.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.22.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.22.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.22.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.22.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.22.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.23.input_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.23.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.23.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.23.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.23.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.23.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.23.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.23.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.23.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.23.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.23.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.23.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.24.input_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.24.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.24.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.24.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.24.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.24.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.24.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.24.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.24.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.24.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.24.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.24.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.25.input_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.25.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.25.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.25.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.25.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.25.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.25.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.25.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.25.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.25.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.25.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.25.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.26.input_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.26.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.26.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.26.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.26.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.26.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.26.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.26.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.26.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.26.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.26.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.26.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.27.input_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.27.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.27.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.27.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.27.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.27.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.27.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.27.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.27.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.27.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.27.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.27.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.28.input_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.28.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.28.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.28.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.28.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.28.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.28.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.28.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.28.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.28.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.28.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.28.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.29.input_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.29.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.29.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.29.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.29.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.29.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.29.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.29.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.29.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.29.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.29.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.29.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.3.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.3.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.3.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.3.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.3.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.3.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.30.input_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.30.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.30.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.30.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.30.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.30.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.30.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.30.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.30.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.30.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.30.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.30.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.31.input_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.31.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.31.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.31.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.31.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.31.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.31.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.31.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.31.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.31.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.31.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.31.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.32.input_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.32.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.32.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.32.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.32.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.32.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.32.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.32.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.32.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.32.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.32.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.32.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.33.input_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.33.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.33.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.33.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.33.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.33.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.33.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.33.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.33.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.33.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.33.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.33.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.34.input_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.34.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.34.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.34.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.34.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.34.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.34.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.34.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.34.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.34.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.34.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.34.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.35.input_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.35.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.35.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.35.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.35.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
-    "model.layers.35.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.35.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.35.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.35.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.35.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.35.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
-    "model.layers.35.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
-    "model.layers.4.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.4.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.4.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.4.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.4.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.4.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.5.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.5.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.5.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.5.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.5.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.5.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.6.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.6.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.6.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.6.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.6.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.6.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.7.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.7.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.7.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.7.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.7.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.7.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.7.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.7.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.7.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.8.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.8.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.8.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.8.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.8.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.8.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.8.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.8.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.8.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.8.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.8.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.8.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.9.input_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.9.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.9.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.9.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.9.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
-    "model.layers.9.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.9.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.9.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.9.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.9.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
-    "model.layers.9.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
-    "model.layers.9.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
-    "model.norm.weight": "model-00002-of-00002.safetensors",
-    "multi_vector_projector.bias": "model-00002-of-00002.safetensors",
-    "multi_vector_projector.weight": "model-00002-of-00002.safetensors",
-    "visual.blocks.0.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.0.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.0.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.0.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.0.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.0.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.0.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.0.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.0.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.0.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.0.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.0.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.1.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.1.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.1.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.1.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.1.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.1.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.1.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.1.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.1.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.1.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.1.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.1.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.10.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.10.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.10.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.10.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.10.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.10.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.10.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.10.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.10.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.10.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.10.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.10.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.11.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.11.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.11.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.11.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.11.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.11.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.11.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.11.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.11.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.11.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.11.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.11.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.12.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.12.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.12.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.12.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.12.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.12.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.12.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.12.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.12.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.12.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.12.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.12.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.13.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.13.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.13.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.13.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.13.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.13.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.13.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.13.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.13.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.13.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.13.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.13.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.14.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.14.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.14.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.14.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.14.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.14.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.14.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.14.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.14.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.14.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.14.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.14.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.15.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.15.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.15.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.15.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.15.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.15.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.15.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.15.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.15.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.15.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.15.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.15.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.16.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.16.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.16.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.16.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.16.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.16.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.16.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.16.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.16.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.16.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.16.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.16.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.17.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.17.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.17.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.17.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.17.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.17.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.17.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.17.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.17.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.17.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.17.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.17.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.18.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.18.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.18.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.18.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.18.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.18.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.18.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.18.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.18.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.18.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.18.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.18.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.19.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.19.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.19.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.19.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.19.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.19.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.19.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.19.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.19.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.19.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.19.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.19.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.2.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.2.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.2.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.2.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.2.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.2.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.2.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.2.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.2.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.2.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.2.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.2.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.20.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.20.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.20.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.20.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.20.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.20.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.20.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.20.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.20.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.20.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.20.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.20.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.21.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.21.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.21.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.21.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.21.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.21.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.21.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.21.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.21.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.21.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.21.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.21.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.22.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.22.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.22.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.22.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.22.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.22.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.22.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.22.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.22.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.22.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.22.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.22.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.23.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.23.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.23.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.23.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.23.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.23.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.23.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.23.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.23.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.23.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.23.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.23.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.24.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.24.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.24.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.24.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.24.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.24.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.24.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.24.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.24.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.24.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.24.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.24.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.25.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.25.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.25.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.25.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.25.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.25.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.25.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.25.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.25.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.25.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.25.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.25.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.26.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.26.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.26.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.26.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.26.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.26.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.26.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.26.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.26.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.26.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.26.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.26.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.27.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.27.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.27.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.27.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.27.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.27.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.27.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.27.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.27.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.27.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.27.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.27.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.28.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.28.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.28.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.28.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.28.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.28.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.28.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.28.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.28.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.28.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.28.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.28.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.29.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.29.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.29.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.29.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.29.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.29.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.29.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.29.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.29.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.29.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.29.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.29.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.3.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.3.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.3.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.3.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.3.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.3.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.3.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.3.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.3.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.3.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.3.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.3.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.30.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.30.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.30.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.30.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.30.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.30.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.30.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.30.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.30.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.30.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.30.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.30.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.31.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.31.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.31.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.31.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.31.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.31.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.31.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.31.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.31.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.31.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.31.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.31.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.4.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.4.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.4.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.4.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.4.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.4.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.4.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.4.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.4.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.4.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.4.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.4.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.5.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.5.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.5.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.5.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.5.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.5.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.5.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.5.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.5.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.5.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.5.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.5.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.6.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.6.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.6.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.6.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.6.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.6.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.6.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.6.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.6.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.6.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.6.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.6.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.7.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.7.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.7.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.7.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.7.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.7.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.7.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.7.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.7.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.7.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.7.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.7.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.8.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.8.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.8.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.8.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.8.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.8.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.8.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.8.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.8.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.8.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.8.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.8.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.9.attn.proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.9.attn.proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.9.attn.qkv.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.9.attn.qkv.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.9.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.9.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.9.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.9.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.9.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
-    "visual.blocks.9.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.9.norm1.weight": "model-00001-of-00002.safetensors",
-    "visual.blocks.9.norm2.weight": "model-00001-of-00002.safetensors",
-    "visual.merger.ln_q.weight": "model-00001-of-00002.safetensors",
-    "visual.merger.mlp.0.bias": "model-00001-of-00002.safetensors",
-    "visual.merger.mlp.0.weight": "model-00001-of-00002.safetensors",
-    "visual.merger.mlp.2.bias": "model-00001-of-00002.safetensors",
-    "visual.merger.mlp.2.weight": "model-00001-of-00002.safetensors",
-    "visual.patch_embed.proj.weight": "model-00001-of-00002.safetensors"
-  }
-}

+version https://git-lfs.github.com/spec/v1
+oid sha256:360eb251d21193fe4a424dcc2f65ee1e9ca66dcbcb497b8141722f9f6ed56f8a
+size 65592

modules.json CHANGED

@@ -1,9 +1,3 @@
-[
-    {
-        "idx": 0,
-        "name": "transformer",
-        "path": "",
-        "type": "custom_st.Transformer",
-        "kwargs": ["task", "truncate_dim"]
-    }
-]

+version https://git-lfs.github.com/spec/v1
+oid sha256:b54acf92eab134d664abbbc5e42fd92f27535d3605666962c15dc3c32b2d9744
+size 168

preprocessor_config.json CHANGED

@@ -1,33 +1,3 @@
-{
-  "do_convert_rgb": true,
-  "do_normalize": true,
-  "do_rescale": true,
-  "do_resize": true,
-  "image_mean": [
-    0.48145466,
-    0.4578275,
-    0.40821073
-  ],
-  "image_processor_type": "Qwen2VLImageProcessor",
-  "image_std": [
-    0.26862954,
-    0.26130258,
-    0.27577711
-  ],
-  "max_pixels": 602112,
-  "merge_size": 2,
-  "min_pixels": 3136,
-  "patch_size": 14,
-  "processor_class": "JinaEmbeddingsV4Processor",
-  "resample": 3,
-  "rescale_factor": 0.00392156862745098,
-  "video_processor_type": "Qwen2VLVideoProcessor",
-  "size": {
-    "longest_edge": 602112,
-    "shortest_edge": 3136
-  },
-  "temporal_patch_size": 2,
-  "auto_map": {
-    "AutoProcessor": "modeling_nova_embeddings_v1.NovaEmbeddingsV1Processor"
-  }
-}

+version https://git-lfs.github.com/spec/v1
+oid sha256:f4439f00d86669352d11321b568332bee8555b0b2a4bea1703e6d4b668810804
+size 726

results.json CHANGED

@@ -1,582 +1,3 @@
-{
-    "arxivqa_test_subsampled": {
-        "ndcg_at_1": 0.844,
-        "ndcg_at_3": 0.88524,
-        "ndcg_at_5": 0.88954,
-        "ndcg_at_10": 0.89512,
-        "ndcg_at_20": 0.90085,
-        "ndcg_at_50": 0.90479,
-        "ndcg_at_100": 0.90578,
-        "map_at_1": 0.844,
-        "map_at_3": 0.87467,
-        "map_at_5": 0.87717,
-        "map_at_10": 0.87933,
-        "map_at_20": 0.88099,
-        "map_at_50": 0.88161,
-        "map_at_100": 0.8817,
-        "recall_at_1": 0.844,
-        "recall_at_3": 0.916,
-        "recall_at_5": 0.926,
-        "recall_at_10": 0.944,
-        "recall_at_20": 0.966,
-        "recall_at_50": 0.986,
-        "recall_at_100": 0.992,
-        "precision_at_1": 0.844,
-        "precision_at_3": 0.30533,
-        "precision_at_5": 0.1852,
-        "precision_at_10": 0.0944,
-        "precision_at_20": 0.0483,
-        "precision_at_50": 0.01972,
-        "precision_at_100": 0.00992,
-        "mrr_at_1": 0.844,
-        "mrr_at_3": 0.8746666666666665,
-        "mrr_at_5": 0.8771666666666665,
-        "mrr_at_10": 0.8793301587301586,
-        "mrr_at_20": 0.880986183261183,
-        "mrr_at_50": 0.8816066058267283,
-        "mrr_at_100": 0.8816959272950264,
-        "naucs_at_1_max": 0.7413901379085128,
-        "naucs_at_1_std": 0.3454872013866209,
-        "naucs_at_1_diff1": 0.9600906830113787,
-        "naucs_at_3_max": 0.7713307545240329,
-        "naucs_at_3_std": 0.4801698457160663,
-        "naucs_at_3_diff1": 0.9489240140500664,
-        "naucs_at_5_max": 0.7514699573523106,
-        "naucs_at_5_std": 0.4375552022610836,
-        "naucs_at_5_diff1": 0.9526206879148043,
-        "naucs_at_10_max": 0.8086901427237575,
-        "naucs_at_10_std": 0.5144891289849284,
-        "naucs_at_10_diff1": 0.9513972255568919,
-        "naucs_at_20_max": 0.907453177349375,
-        "naucs_at_20_std": 0.5683802932937894,
-        "naucs_at_20_diff1": 0.9692425990003846,
-        "naucs_at_50_max": 0.8709483793517359,
-        "naucs_at_50_std": 0.7055488862211612,
-        "naucs_at_50_diff1": 0.9626517273576126,
-        "naucs_at_100_max": 0.8068394024276366,
-        "naucs_at_100_std": 0.7076330532212914,
-        "naucs_at_100_diff1": 0.9673202614378978
-    },
-    "docvqa_test_subsampled": {
-        "ndcg_at_1": 0.52328,
-        "ndcg_at_3": 0.5841,
-        "ndcg_at_5": 0.59975,
-        "ndcg_at_10": 0.62669,
-        "ndcg_at_20": 0.64245,
-        "ndcg_at_50": 0.65661,
-        "ndcg_at_100": 0.66492,
-        "map_at_1": 0.52328,
-        "map_at_3": 0.56911,
-        "map_at_5": 0.57786,
-        "map_at_10": 0.58881,
-        "map_at_20": 0.59317,
-        "map_at_50": 0.59548,
-        "map_at_100": 0.59622,
-        "recall_at_1": 0.52328,
-        "recall_at_3": 0.62749,
-        "recall_at_5": 0.66519,
-        "recall_at_10": 0.74945,
-        "recall_at_20": 0.81153,
-        "recall_at_50": 0.88248,
-        "recall_at_100": 0.93348,
-        "precision_at_1": 0.52328,
-        "precision_at_3": 0.20916,
-        "precision_at_5": 0.13304,
-        "precision_at_10": 0.07494,
-        "precision_at_20": 0.04058,
-        "precision_at_50": 0.01765,
-        "precision_at_100": 0.00933,
-        "mrr_at_1": 0.5232815964523282,
-        "mrr_at_3": 0.5691056910569108,
-        "mrr_at_5": 0.5778640059127865,
-        "mrr_at_10": 0.5888132193010243,
-        "mrr_at_20": 0.5931663069177401,
-        "mrr_at_50": 0.5954783504735428,
-        "mrr_at_100": 0.5962169799244146,
-        "naucs_at_1_max": 0.46089368028029637,
-        "naucs_at_1_std": 0.19359243300005127,
-        "naucs_at_1_diff1": 0.8483527783001977,
-        "naucs_at_3_max": 0.4640279399849662,
-        "naucs_at_3_std": 0.1814509120980464,
-        "naucs_at_3_diff1": 0.7719022256243834,
-        "naucs_at_5_max": 0.45716016762761796,
-        "naucs_at_5_std": 0.16428980258139747,
-        "naucs_at_5_diff1": 0.750196647594659,
-        "naucs_at_10_max": 0.3956528364820721,
-        "naucs_at_10_std": 0.09973122080056422,
-        "naucs_at_10_diff1": 0.7237863238311393,
-        "naucs_at_20_max": 0.35927664451426317,
-        "naucs_at_20_std": 0.09080366240903168,
-        "naucs_at_20_diff1": 0.6946736504983693,
-        "naucs_at_50_max": 0.3626447370884348,
-        "naucs_at_50_std": 0.2775120087087966,
-        "naucs_at_50_diff1": 0.6534710933108262,
-        "naucs_at_100_max": 0.32155287639122004,
-        "naucs_at_100_std": 0.3495021025151782,
-        "naucs_at_100_diff1": 0.6165810885563539
-    },
-    "infovqa_test_subsampled": {
-        "ndcg_at_1": 0.90283,
-        "ndcg_at_3": 0.93062,
-        "ndcg_at_5": 0.93567,
-        "ndcg_at_10": 0.93969,
-        "ndcg_at_20": 0.94324,
-        "ndcg_at_50": 0.94401,
-        "ndcg_at_100": 0.945,
-        "map_at_1": 0.90283,
-        "map_at_3": 0.92409,
-        "map_at_5": 0.92692,
-        "map_at_10": 0.92863,
-        "map_at_20": 0.92959,
-        "map_at_50": 0.9297,
-        "map_at_100": 0.92979,
-        "recall_at_1": 0.90283,
-        "recall_at_3": 0.94939,
-        "recall_at_5": 0.96154,
-        "recall_at_10": 0.97368,
-        "recall_at_20": 0.98785,
-        "recall_at_50": 0.9919,
-        "recall_at_100": 0.99798,
-        "precision_at_1": 0.90283,
-        "precision_at_3": 0.31646,
-        "precision_at_5": 0.19231,
-        "precision_at_10": 0.09737,
-        "precision_at_20": 0.04939,
-        "precision_at_50": 0.01984,
-        "precision_at_100": 0.00998,
-        "mrr_at_1": 0.902834008097166,
-        "mrr_at_3": 0.9240890688259108,
-        "mrr_at_5": 0.9269230769230767,
-        "mrr_at_10": 0.9286316753422016,
-        "mrr_at_20": 0.9295898610333593,
-        "mrr_at_50": 0.929699602843506,
-        "mrr_at_100": 0.929788457049907,
-        "naucs_at_1_max": 0.6026903076230651,
-        "naucs_at_1_std": 0.261936050485784,
-        "naucs_at_1_diff1": 0.9396804875719484,
-        "naucs_at_3_max": 0.7565375225904929,
-        "naucs_at_3_std": 0.45980620999702715,
-        "naucs_at_3_diff1": 0.9534218386220948,
-        "naucs_at_5_max": 0.8235249494008307,
-        "naucs_at_5_std": 0.5316999544043512,
-        "naucs_at_5_diff1": 0.9524604670358964,
-        "naucs_at_10_max": 0.8684766575602219,
-        "naucs_at_10_std": 0.5944713216706646,
-        "naucs_at_10_diff1": 0.9405654098266761,
-        "naucs_at_20_max": 0.7830887900175995,
-        "naucs_at_20_std": 0.5643438299512757,
-        "naucs_at_20_diff1": 0.8929919636352566,
-        "naucs_at_50_max": 0.7072835485426375,
-        "naucs_at_50_std": 0.5764614839135555,
-        "naucs_at_50_diff1": 0.8394879454528887,
-        "naucs_at_100_max": 1.0,
-        "naucs_at_100_std": 1.0,
-        "naucs_at_100_diff1": 1.0
-    },
-    "tabfquad_test_subsampled": {
-        "ndcg_at_1": 0.9,
-        "ndcg_at_3": 0.94685,
-        "ndcg_at_5": 0.95131,
-        "ndcg_at_10": 0.95366,
-        "ndcg_at_20": 0.95455,
-        "ndcg_at_50": 0.9553,
-        "ndcg_at_100": 0.9553,
-        "map_at_1": 0.9,
-        "map_at_3": 0.9369,
-        "map_at_5": 0.9394,
-        "map_at_10": 0.9404,
-        "map_at_20": 0.94063,
-        "map_at_50": 0.94077,
-        "map_at_100": 0.94077,
-        "recall_at_1": 0.9,
-        "recall_at_3": 0.975,
-        "recall_at_5": 0.98571,
-        "recall_at_10": 0.99286,
-        "recall_at_20": 0.99643,
-        "recall_at_50": 1.0,
-        "recall_at_100": 1.0,
-        "precision_at_1": 0.9,
-        "precision_at_3": 0.325,
-        "precision_at_5": 0.19714,
-        "precision_at_10": 0.09929,
-        "precision_at_20": 0.04982,
-        "precision_at_50": 0.02,
-        "precision_at_100": 0.01,
-        "mrr_at_1": 0.9,
-        "mrr_at_3": 0.936904761904762,
-        "mrr_at_5": 0.9394047619047617,
-        "mrr_at_10": 0.9403968253968255,
-        "mrr_at_20": 0.9406349206349207,
-        "mrr_at_50": 0.9407722832722833,
-        "mrr_at_100": 0.9407722832722833,
-        "naucs_at_1_max": 0.39284046952114193,
-        "naucs_at_1_std": 0.06274176337201544,
-        "naucs_at_1_diff1": 0.9321395224756563,
-        "naucs_at_3_max": 0.98132586367881,
-        "naucs_at_3_std": 0.9042950513538718,
-        "naucs_at_3_diff1": 0.98132586367881,
-        "naucs_at_5_max": 0.967320261437913,
-        "naucs_at_5_std": 0.8978758169934754,
-        "naucs_at_5_diff1": 1.0,
-        "naucs_at_10_max": 1.0,
-        "naucs_at_10_std": 0.9346405228758269,
-        "naucs_at_10_diff1": 1.0,
-        "naucs_at_20_max": 1.0,
-        "naucs_at_20_std": 1.0,
-        "naucs_at_20_diff1": 1.0,
-        "naucs_at_50_max": 1.0,
-        "naucs_at_50_std": 1.0,
-        "naucs_at_50_diff1": 1.0,
-        "naucs_at_100_max": 1.0,
-        "naucs_at_100_std": 1.0,
-        "naucs_at_100_diff1": 1.0
-    },
-    "tatdqa_test": {
-        "ndcg_at_1": 0.68834,
-        "ndcg_at_3": 0.7834,
-        "ndcg_at_5": 0.80344,
-        "ndcg_at_10": 0.81851,
-        "ndcg_at_20": 0.82469,
-        "ndcg_at_50": 0.82852,
-        "ndcg_at_100": 0.82981,
-        "map_at_1": 0.68834,
-        "map_at_3": 0.76073,
-        "map_at_5": 0.772,
-        "map_at_10": 0.7783,
-        "map_at_20": 0.78002,
-        "map_at_50": 0.78067,
-        "map_at_100": 0.78079,
-        "recall_at_1": 0.68834,
-        "recall_at_3": 0.84872,
-        "recall_at_5": 0.89672,
-        "recall_at_10": 0.94289,
-        "recall_at_20": 0.96719,
-        "recall_at_50": 0.98603,
-        "recall_at_100": 0.99392,
-        "precision_at_1": 0.68834,
-        "precision_at_3": 0.28291,
-        "precision_at_5": 0.17934,
-        "precision_at_10": 0.09429,
-        "precision_at_20": 0.04836,
-        "precision_at_50": 0.01972,
-        "precision_at_100": 0.00994,
-        "mrr_at_1": 0.6865127582017011,
-        "mrr_at_3": 0.7598217901984609,
-        "mrr_at_5": 0.7710307816929933,
-        "mrr_at_10": 0.7773322532739296,
-        "mrr_at_20": 0.7790656715075932,
-        "mrr_at_50": 0.7797137179788176,
-        "mrr_at_100": 0.7798294471430899,
-        "naucs_at_1_max": 0.19289339347399329,
-        "naucs_at_1_std": -0.05373436574034402,
-        "naucs_at_1_diff1": 0.8118815353915732,
-        "naucs_at_3_max": 0.24444248974914928,
-        "naucs_at_3_std": 0.012951438245694854,
-        "naucs_at_3_diff1": 0.7252009696977523,
-        "naucs_at_5_max": 0.27477480629269946,
-        "naucs_at_5_std": 0.10687833140288663,
-        "naucs_at_5_diff1": 0.7019146338300569,
-        "naucs_at_10_max": 0.23474834180340118,
-        "naucs_at_10_std": 0.13375117651376378,
-        "naucs_at_10_diff1": 0.6766342016471449,
-        "naucs_at_20_max": 0.3762582961131715,
-        "naucs_at_20_std": 0.29216428469292166,
-        "naucs_at_20_diff1": 0.6564671335087516,
-        "naucs_at_50_max": 0.4691053847445,
-        "naucs_at_50_std": 0.4359718488363951,
-        "naucs_at_50_diff1": 0.7152604718494652,
-        "naucs_at_100_max": 0.5259975902909616,
-        "naucs_at_100_std": 0.651086653120611,
-        "naucs_at_100_diff1": 0.7663843453532901
-    },
-    "shiftproject_test": {
-        "ndcg_at_1": 0.85,
-        "ndcg_at_3": 0.91917,
-        "ndcg_at_5": 0.92347,
-        "ndcg_at_10": 0.92949,
-        "ndcg_at_20": 0.92949,
-        "ndcg_at_50": 0.92949,
-        "ndcg_at_100": 0.92949,
-        "map_at_1": 0.85,
-        "map_at_3": 0.90167,
-        "map_at_5": 0.90417,
-        "map_at_10": 0.90639,
-        "map_at_20": 0.90639,
-        "map_at_50": 0.90639,
-        "map_at_100": 0.90639,
-        "recall_at_1": 0.85,
-        "recall_at_3": 0.97,
-        "recall_at_5": 0.98,
-        "recall_at_10": 1.0,
-        "recall_at_20": 1.0,
-        "recall_at_50": 1.0,
-        "recall_at_100": 1.0,
-        "precision_at_1": 0.85,
-        "precision_at_3": 0.32333,
-        "precision_at_5": 0.196,
-        "precision_at_10": 0.1,
-        "precision_at_20": 0.05,
-        "precision_at_50": 0.02,
-        "precision_at_100": 0.01,
-        "mrr_at_1": 0.85,
-        "mrr_at_3": 0.9016666666666666,
-        "mrr_at_5": 0.9041666666666666,
-        "mrr_at_10": 0.9063888888888889,
-        "mrr_at_20": 0.9063888888888889,
-        "mrr_at_50": 0.9063888888888889,
-        "mrr_at_100": 0.9063888888888889,
-        "naucs_at_1_max": 0.029189716889034732,
-        "naucs_at_1_std": -0.37507321835340074,
-        "naucs_at_1_diff1": 0.7931012040351454,
-        "naucs_at_3_max": 0.5589791472144446,
-        "naucs_at_3_std": 0.09056956115779448,
-        "naucs_at_3_diff1": 0.9564270152505466,
-        "naucs_at_5_max": 0.3384687208216692,
-        "naucs_at_5_std": -0.2987861811391239,
-        "naucs_at_5_diff1": 1.0,
-        "naucs_at_10_max": 1.0,
-        "naucs_at_10_std": 1.0,
-        "naucs_at_10_diff1": 1.0,
-        "naucs_at_20_max": 1.0,
-        "naucs_at_20_std": 1.0,
-        "naucs_at_20_diff1": 1.0,
-        "naucs_at_50_max": null,
-        "naucs_at_50_std": null,
-        "naucs_at_50_diff1": null,
-        "naucs_at_100_max": null,
-        "naucs_at_100_std": null,
-        "naucs_at_100_diff1": null
-    },
-    "syntheticDocQA_artificial_intelligence_test": {
-        "ndcg_at_1": 0.98,
-        "ndcg_at_3": 0.99262,
-        "ndcg_at_5": 0.99262,
-        "ndcg_at_10": 0.99262,
-        "ndcg_at_20": 0.99262,
-        "ndcg_at_50": 0.99262,
-        "ndcg_at_100": 0.99262,
-        "map_at_1": 0.98,
-        "map_at_3": 0.99,
-        "map_at_5": 0.99,
-        "map_at_10": 0.99,
-        "map_at_20": 0.99,
-        "map_at_50": 0.99,
-        "map_at_100": 0.99,
-        "recall_at_1": 0.98,
-        "recall_at_3": 1.0,
-        "recall_at_5": 1.0,
-        "recall_at_10": 1.0,
-        "recall_at_20": 1.0,
-        "recall_at_50": 1.0,
-        "recall_at_100": 1.0,
-        "precision_at_1": 0.98,
-        "precision_at_3": 0.33333,
-        "precision_at_5": 0.2,
-        "precision_at_10": 0.1,
-        "precision_at_20": 0.05,
-        "precision_at_50": 0.02,
-        "precision_at_100": 0.01,
-        "mrr_at_1": 0.98,
-        "mrr_at_3": 0.99,
-        "mrr_at_5": 0.99,
-        "mrr_at_10": 0.99,
-        "mrr_at_20": 0.99,
-        "mrr_at_50": 0.99,
-        "mrr_at_100": 0.99,
-        "naucs_at_1_max": 0.540149393090569,
-        "naucs_at_1_std": 0.3384687208216605,
-        "naucs_at_1_diff1": 0.9346405228758133,
-        "naucs_at_3_max": 1.0,
-        "naucs_at_3_std": 1.0,
-        "naucs_at_3_diff1": 1.0,
-        "naucs_at_5_max": 1.0,
-        "naucs_at_5_std": 1.0,
-        "naucs_at_5_diff1": 1.0,
-        "naucs_at_10_max": 1.0,
-        "naucs_at_10_std": 1.0,
-        "naucs_at_10_diff1": 1.0,
-        "naucs_at_20_max": 1.0,
-        "naucs_at_20_std": 1.0,
-        "naucs_at_20_diff1": 1.0,
-        "naucs_at_50_max": null,
-        "naucs_at_50_std": null,
-        "naucs_at_50_diff1": null,
-        "naucs_at_100_max": null,
-        "naucs_at_100_std": null,
-        "naucs_at_100_diff1": null
-    },
-    "syntheticDocQA_energy_test": {
-        "ndcg_at_1": 0.95,
-        "ndcg_at_3": 0.96762,
-        "ndcg_at_5": 0.96762,
-        "ndcg_at_10": 0.97118,
-        "ndcg_at_20": 0.97118,
-        "ndcg_at_50": 0.973,
-        "ndcg_at_100": 0.973,
-        "map_at_1": 0.95,
-        "map_at_3": 0.96333,
-        "map_at_5": 0.96333,
-        "map_at_10": 0.965,
-        "map_at_20": 0.965,
-        "map_at_50": 0.96523,
-        "map_at_100": 0.96523,
-        "recall_at_1": 0.95,
-        "recall_at_3": 0.98,
-        "recall_at_5": 0.98,
-        "recall_at_10": 0.99,
-        "recall_at_20": 0.99,
-        "recall_at_50": 1.0,
-        "recall_at_100": 1.0,
-        "precision_at_1": 0.95,
-        "precision_at_3": 0.32667,
-        "precision_at_5": 0.196,
-        "precision_at_10": 0.099,
-        "precision_at_20": 0.0495,
-        "precision_at_50": 0.02,
-        "precision_at_100": 0.01,
-        "mrr_at_1": 0.95,
-        "mrr_at_3": 0.9633333333333333,
-        "mrr_at_5": 0.9633333333333333,
-        "mrr_at_10": 0.965,
-        "mrr_at_20": 0.965,
-        "mrr_at_50": 0.9652272727272727,
-        "mrr_at_100": 0.9652272727272727,
-        "naucs_at_1_max": 0.42726423902894384,
-        "naucs_at_1_std": -0.4889822595704953,
-        "naucs_at_1_diff1": 1.0,
-        "naucs_at_3_max": 0.6136788048552655,
-        "naucs_at_3_std": -0.6909430438842241,
-        "naucs_at_3_diff1": 1.0,
-        "naucs_at_5_max": 0.6136788048552745,
-        "naucs_at_5_std": -0.690943043884218,
-        "naucs_at_5_diff1": 1.0,
-        "naucs_at_10_max": 0.8692810457516413,
-        "naucs_at_10_std": 0.35807656395891135,
-        "naucs_at_10_diff1": 1.0,
-        "naucs_at_20_max": 0.8692810457516413,
-        "naucs_at_20_std": 0.35807656395891135,
-        "naucs_at_20_diff1": 1.0,
-        "naucs_at_50_max": null,
-        "naucs_at_50_std": null,
-        "naucs_at_50_diff1": null,
-        "naucs_at_100_max": null,
-        "naucs_at_100_std": null,
-        "naucs_at_100_diff1": null
-    },
-    "syntheticDocQA_government_reports_test": {
-        "ndcg_at_1": 0.93,
-        "ndcg_at_3": 0.96524,
-        "ndcg_at_5": 0.96954,
-        "ndcg_at_10": 0.96954,
-        "ndcg_at_20": 0.96954,
-        "ndcg_at_50": 0.96954,
-        "ndcg_at_100": 0.96954,
-        "map_at_1": 0.93,
-        "map_at_3": 0.95667,
-        "map_at_5": 0.95917,
-        "map_at_10": 0.95917,
-        "map_at_20": 0.95917,
-        "map_at_50": 0.95917,
-        "map_at_100": 0.95917,
-        "recall_at_1": 0.93,
-        "recall_at_3": 0.99,
-        "recall_at_5": 1.0,
-        "recall_at_10": 1.0,
-        "recall_at_20": 1.0,
-        "recall_at_50": 1.0,
-        "recall_at_100": 1.0,
-        "precision_at_1": 0.93,
-        "precision_at_3": 0.33,
-        "precision_at_5": 0.2,
-        "precision_at_10": 0.1,
-        "precision_at_20": 0.05,
-        "precision_at_50": 0.02,
-        "precision_at_100": 0.01,
-        "mrr_at_1": 0.93,
-        "mrr_at_3": 0.9566666666666667,
-        "mrr_at_5": 0.9591666666666667,
-        "mrr_at_10": 0.9591666666666667,
-        "mrr_at_20": 0.9591666666666667,
-        "mrr_at_50": 0.9591666666666667,
-        "mrr_at_100": 0.9591666666666667,
-        "naucs_at_1_max": 0.6809390422835813,
-        "naucs_at_1_std": 0.5458850206749362,
-        "naucs_at_1_diff1": 0.9229691876750709,
-        "naucs_at_3_max": 1.0,
-        "naucs_at_3_std": 1.0,
-        "naucs_at_3_diff1": 1.0,
-        "naucs_at_5_max": 1.0,
-        "naucs_at_5_std": 1.0,
-        "naucs_at_5_diff1": 1.0,
-        "naucs_at_10_max": 1.0,
-        "naucs_at_10_std": 1.0,
-        "naucs_at_10_diff1": 1.0,
-        "naucs_at_20_max": 1.0,
-        "naucs_at_20_std": 1.0,
-        "naucs_at_20_diff1": 1.0,
-        "naucs_at_50_max": null,
-        "naucs_at_50_std": null,
-        "naucs_at_50_diff1": null,
-        "naucs_at_100_max": null,
-        "naucs_at_100_std": null,
-        "naucs_at_100_diff1": null
-    },
-    "syntheticDocQA_healthcare_industry_test": {
-        "ndcg_at_1": 0.96,
-        "ndcg_at_3": 0.98393,
-        "ndcg_at_5": 0.98393,
-        "ndcg_at_10": 0.98393,
-        "ndcg_at_20": 0.98393,
-        "ndcg_at_50": 0.98393,
-        "ndcg_at_100": 0.98393,
-        "map_at_1": 0.96,
-        "map_at_3": 0.97833,
-        "map_at_5": 0.97833,
-        "map_at_10": 0.97833,
-        "map_at_20": 0.97833,
-        "map_at_50": 0.97833,
-        "map_at_100": 0.97833,
-        "recall_at_1": 0.96,
-        "recall_at_3": 1.0,
-        "recall_at_5": 1.0,
-        "recall_at_10": 1.0,
-        "recall_at_20": 1.0,
-        "recall_at_50": 1.0,
-        "recall_at_100": 1.0,
-        "precision_at_1": 0.96,
-        "precision_at_3": 0.33333,
-        "precision_at_5": 0.2,
-        "precision_at_10": 0.1,
-        "precision_at_20": 0.05,
-        "precision_at_50": 0.02,
-        "precision_at_100": 0.01,
-        "mrr_at_1": 0.96,
-        "mrr_at_3": 0.9783333333333333,
-        "mrr_at_5": 0.9783333333333333,
-        "mrr_at_10": 0.9783333333333333,
-        "mrr_at_20": 0.9783333333333333,
-        "mrr_at_50": 0.9783333333333333,
-        "mrr_at_100": 0.9783333333333333,
-        "naucs_at_1_max": 0.7047152194211012,
-        "naucs_at_1_std": 0.32037815126050734,
-        "naucs_at_1_diff1": 1.0,
-        "naucs_at_3_max": 1.0,
-        "naucs_at_3_std": 1.0,
-        "naucs_at_3_diff1": 1.0,
-        "naucs_at_5_max": 1.0,
-        "naucs_at_5_std": 1.0,
-        "naucs_at_5_diff1": 1.0,
-        "naucs_at_10_max": 1.0,
-        "naucs_at_10_std": 1.0,
-        "naucs_at_10_diff1": 1.0,
-        "naucs_at_20_max": 1.0,
-        "naucs_at_20_std": 1.0,
-        "naucs_at_20_diff1": 1.0,
-        "naucs_at_50_max": null,
-        "naucs_at_50_std": null,
-        "naucs_at_50_diff1": null,
-        "naucs_at_100_max": null,
-        "naucs_at_100_std": null,
-        "naucs_at_100_diff1": null
-    }
-}

+version https://git-lfs.github.com/spec/v1
+oid sha256:304bc4d3d620c30be991b788fc0e5c3176495a404a6119d11dbc3c298d5ad575
+size 20325

special_tokens_map.json CHANGED

@@ -1,31 +1,3 @@
-{
-  "additional_special_tokens": [
-    "<|im_start|>",
-    "<|im_end|>",
-    "<|object_ref_start|>",
-    "<|object_ref_end|>",
-    "<|box_start|>",
-    "<|box_end|>",
-    "<|quad_start|>",
-    "<|quad_end|>",
-    "<|vision_start|>",
-    "<|vision_end|>",
-    "<|vision_pad|>",
-    "<|image_pad|>",
-    "<|video_pad|>"
-  ],
-  "eos_token": {
-    "content": "<|im_end|>",
-    "lstrip": false,
-    "normalized": false,
-    "rstrip": false,
-    "single_word": false
-  },
-  "pad_token": {
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": false,
-    "rstrip": false,
-    "single_word": false
-  }
-}

+version https://git-lfs.github.com/spec/v1
+oid sha256:76862e765266b85aa9459767e33cbaf13970f327a0e88d1c65846c2ddd3a1ecd
+size 613

tokenizer_config.json CHANGED

@@ -1,209 +1,3 @@
-{
-  "add_bos_token": false,
-  "add_prefix_space": false,
-  "added_tokens_decoder": {
-    "151643": {
-      "content": "<|endoftext|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "151644": {
-      "content": "<|im_start|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "151645": {
-      "content": "<|im_end|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "151646": {
-      "content": "<|object_ref_start|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "151647": {
-      "content": "<|object_ref_end|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "151648": {
-      "content": "<|box_start|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "151649": {
-      "content": "<|box_end|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "151650": {
-      "content": "<|quad_start|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "151651": {
-      "content": "<|quad_end|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "151652": {
-      "content": "<|vision_start|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "151653": {
-      "content": "<|vision_end|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "151654": {
-      "content": "<|vision_pad|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "151655": {
-      "content": "<|image_pad|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "151656": {
-      "content": "<|video_pad|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "151657": {
-      "content": "<tool_call>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "151658": {
-      "content": "</tool_call>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "151659": {
-      "content": "<|fim_prefix|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "151660": {
-      "content": "<|fim_middle|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "151661": {
-      "content": "<|fim_suffix|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "151662": {
-      "content": "<|fim_pad|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "151663": {
-      "content": "<|repo_name|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    },
-    "151664": {
-      "content": "<|file_sep|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": false
-    }
-  },
-  "additional_special_tokens": [
-    "<|im_start|>",
-    "<|im_end|>",
-    "<|object_ref_start|>",
-    "<|object_ref_end|>",
-    "<|box_start|>",
-    "<|box_end|>",
-    "<|quad_start|>",
-    "<|quad_end|>",
-    "<|vision_start|>",
-    "<|vision_end|>",
-    "<|vision_pad|>",
-    "<|image_pad|>",
-    "<|video_pad|>"
-  ],
-  "bos_token": null,
-  "chat_template": "{%- if tools %}\n    {{- '<|im_start|>system\\n' }}\n    {%- if messages[0]['role'] == 'system' %}\n        {{- messages[0]['content'] }}\n    {%- else %}\n        {{- 'You are a helpful assistant.' }}\n    {%- endif %}\n    {{- \"\\n\\n# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n    {%- for tool in tools %}\n        {{- \"\\n\" }}\n        {{- tool | tojson }}\n    {%- endfor %}\n    {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n    {%- if messages[0]['role'] == 'system' %}\n        {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n    {%- else %}\n        {{- '<|im_start|>system\\nYou are a helpful assistant.<|im_end|>\\n' }}\n    {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n    {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n        {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n    {%- elif message.role == \"assistant\" %}\n        {{- '<|im_start|>' + message.role }}\n        {%- if message.content %}\n            {{- '\\n' + message.content }}\n        {%- endif %}\n        {%- for tool_call in message.tool_calls %}\n            {%- if tool_call.function is defined %}\n                {%- set tool_call = tool_call.function %}\n            {%- endif %}\n            {{- '\\n<tool_call>\\n{\"name\": \"' }}\n            {{- tool_call.name }}\n            {{- '\", \"arguments\": ' }}\n            {{- tool_call.arguments | tojson }}\n            {{- '}\\n</tool_call>' }}\n        {%- endfor %}\n        {{- '<|im_end|>\\n' }}\n    {%- elif message.role == \"tool\" %}\n        {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}\n            {{- '<|im_start|>user' }}\n        {%- endif %}\n        {{- '\\n<tool_response>\\n' }}\n        {{- message.content }}\n        {{- '\\n</tool_response>' }}\n        {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n            {{- '<|im_end|>\\n' }}\n        {%- endif %}\n    {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n    {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n",
-  "clean_up_tokenization_spaces": false,
-  "eos_token": "<|im_end|>",
-  "errors": "replace",
-  "extra_special_tokens": {},
-  "model_max_length": 131072,
-  "pad_token": "<|endoftext|>",
-  "processor_class": "JinaEmbeddingsV4Processor",
-  "split_special_tokens": false,
-  "tokenizer_class": "Qwen2Tokenizer",
-  "unk_token": null
-}

+version https://git-lfs.github.com/spec/v1
+oid sha256:13d28527663126ad9ab8a34aa6a4028b3f0b25f100defec89ee90b442d368dde
+size 7306

vocab.json CHANGED

The diff for this file is too large to render.