Phi-4-mini N3 Transform to Knowledge Graph Fine-tune

This model is a fine-tuned version of microsoft/Phi-4-mini-instruct optimized for transforming entity and schema information into JSON-LD format, trained as part of the WIM (Wikipedia to Knowledge Graph) pipeline.

Model Details

Model Description

  • Developed by: UWV InnovatieHub
  • Model type: Causal Language Model with LoRA fine-tuning
  • Language(s): Dutch (nl)
  • License: MIT
  • Finetuned from: microsoft/Phi-4-mini-instruct (3.82B parameters)
  • Training Framework: Unsloth (optimized training for extreme context lengths)

Training Details

  • Dataset: UWV/wim-instruct-wiki-to-jsonld-agent-steps
  • Dataset Size: 10,593 N3-specific examples (JSON-LD transformation tasks)
  • Training Duration: 41 hours 54 minutes
  • Hardware: NVIDIA A100 80GB
  • Context Length: 131,072 tokens (128K)
  • Steps: 1,000
  • Training Metrics:
    • Final Training Loss: 0.11
    • Final Eval Loss: 0.119
    • Trainable Parameters: ~178M (4.4% of model)

LoRA Configuration

{
    "r": 320,                    # Rank (Microsoft's recommended config)
    "lora_alpha": 320,          # Alpha (1:1 ratio for Phi-4)
    "lora_dropout": 0.0,        # No dropout
    "bias": "none",
    "task_type": "CAUSAL_LM",
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj"
    ]
}
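
Roughly the same adapter setup can be reproduced with Unsloth, the framework used for training. The sketch below is illustrative only; parameter names follow Unsloth's FastLanguageModel API and may need adjusting for your Unsloth version.

# A minimal sketch of recreating this adapter setup with Unsloth (illustrative,
# not the exact training script).
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="microsoft/Phi-4-mini-instruct",
    max_seq_length=131072,        # 128K context, as used in training
    dtype=None,                   # auto-detect (bf16 on A100)
    load_in_4bit=False,
)

model = FastLanguageModel.get_peft_model(
    model,
    r=320,
    lora_alpha=320,
    lora_dropout=0.0,
    bias="none",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    use_gradient_checkpointing="unsloth",
    random_state=42,
)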

Training Configuration

{
    "model": "phi4-mini",
    "max_seq_length": 131072,    # 128K context
    "batch_size": 1,
    "gradient_accumulation_steps": 8,
    "effective_batch_size": 8,
    "learning_rate": 1e-5,
    "warmup_steps": 20,
    "max_grad_norm": 1.0,
    "lr_scheduler": "linear",
    "optimizer": "paged_adamw_8bit",
    "bf16": True,
    "gradient_checkpointing": True,
    "seed": 42
}
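
The hyperparameters above map onto standard Hugging Face TrainingArguments, as typically passed to TRL's SFTTrainer in an Unsloth workflow. The sketch below is an approximation of the actual training script; the output path is hypothetical.

# Equivalent TrainingArguments for the run above (approximation).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs/wim-n3-phi4-mini",   # hypothetical path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,           # effective batch size of 8
    learning_rate=1e-5,
    warmup_steps=20,
    max_grad_norm=1.0,
    lr_scheduler_type="linear",
    optim="paged_adamw_8bit",                # requires bitsandbytes
    bf16=True,
    gradient_checkpointing=True,
    max_steps=1000,
    seed=42,
)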

Intended Uses & Limitations

Intended Uses

  • JSON-LD Generation: Transform entity and schema information into valid JSON-LD format
  • Knowledge Graph Construction: Third step (N3) in the WIM pipeline
  • Structured Data Creation: Convert unstructured entity descriptions to Schema.org-compliant JSON-LD
  • Long Context Processing: Handle extremely long input sequences (up to 128K tokens)

Limitations

  • Requires extensive context (average input ~40K tokens)
  • Memory intensive due to long sequences
  • Best performance with Phi-4's specific prompt format
  • May require post-processing validation (N4 step)

How to Use

Option 1: Using the Merged Model (Recommended)

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import json

# Load the merged model (ready to use)
model = AutoModelForCausalLM.from_pretrained(
    "UWV/wim-n3-phi4-mini-merged",  # Update with actual repo
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("UWV/wim-n3-phi4-mini-merged")

# Prepare input (typically very long with entity and schema information)
entities = [
    {"name": "Amsterdam", "type": "City"},
    {"name": "Netherlands", "type": "Country"}
]
schemas = {
    "City": "https://schema.org/City",
    "Country": "https://schema.org/Country"
}

messages = [
    {
        "role": "system", 
        "content": "You are an expert in creating JSON-LD representations using Schema.org vocabulary."
    },
    {
        "role": "user", 
        "content": f"""Transform the following entities into JSON-LD format using Schema.org:

Entities: {json.dumps(entities, ensure_ascii=False)}
Schemas: {json.dumps(schemas, ensure_ascii=False)}

Create a complete JSON-LD representation with proper @context and @type declarations."""
    }
]

# Apply chat template and generate
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=131072)
inputs = {k: v.to(model.device) for k, v in inputs.items()}

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=4096,  # JSON-LD can be long
        temperature=0.1,      # Low temperature for valid JSON
        do_sample=True,
        top_p=0.95,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens (the prompt would otherwise be echoed back)
generated_tokens = outputs[0][inputs["input_ids"].shape[1]:]
json_ld = tokenizer.decode(generated_tokens, skip_special_tokens=True).strip()

print(json_ld)

Option 2: Using the LoRA Adapter

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-4-mini-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

# Load adapter
model = PeftModel.from_pretrained(
    base_model,
    "UWV/wim-n3-phi4-mini-adapter"  # Update with actual repo
)
tokenizer = AutoTokenizer.from_pretrained("UWV/wim-n3-phi4-mini-adapter")

# Use same inference code as above...

Expected Output Format

The model outputs valid JSON-LD with Schema.org vocabulary:

{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "City",
      "@id": "_:amsterdam",
      "name": "Amsterdam",
      "containedInPlace": {
        "@id": "_:netherlands"
      }
    },
    {
      "@type": "Country",
      "@id": "_:netherlands",
      "name": "Netherlands"
    }
  ]
}
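
Output like the example above can be checked programmatically before the N4 validation step. A minimal sketch, assuming the rdflib package (version 6 or later, which ships a built-in JSON-LD parser):

import json
from rdflib import Graph  # rdflib >= 6 includes the JSON-LD plugin

def validate_jsonld(text: str) -> int:
    """Parse model output as JSON-LD and return the number of RDF triples."""
    doc = json.loads(text)                      # raises if the output is not valid JSON
    assert "@context" in doc, "missing @context"
    # Resolving the remote https://schema.org context requires network access.
    g = Graph().parse(data=text, format="json-ld")
    return len(g)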

Dataset Information

The model was trained on the UWV/wim-instruct-wiki-to-jsonld-agent-steps dataset (a loading sketch follows the list below), which contains:

  • Source: Dutch Wikipedia articles processed through N1 and N2 steps
  • Processing: Multi-agent pipeline converting text to JSON-LD
  • N3 Examples: 10,593 transformation tasks
  • Average Token Length: ~40,388 tokens (extremely long sequences)
  • Max Token Length: 520,575 tokens
  • Format: ChatML-formatted instruction-following examples
  • Task: Transform entity and schema information into valid JSON-LD
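
The dataset can be pulled from the Hub for inspection. A minimal sketch, assuming the default configuration and a train split:

from datasets import load_dataset

# Assumes the default configuration and a "train" split on the Hub; adjust as needed.
ds = load_dataset("UWV/wim-instruct-wiki-to-jsonld-agent-steps", split="train")
print(ds)                 # number of rows and features
print(ds.column_names)    # fields of the ChatML-style examples (individual rows can exceed 40K tokens)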

Training Results

Training converged cleanly with minimal overfitting:

  • Final Training Loss: 0.11
  • Final Eval Loss: 0.119 (very close to the training loss)
  • Train/Eval Loss Ratio: 0.92 (indicating good generalization)

This was achieved despite the extreme context lengths and complex transformation task.

Model Versions

  • Merged Model: UWV/wim-n3-phi4-mini-merged (base model with the 681MB adapter merged in)
    • Ready to use without adapter loading
    • Recommended for production inference
  • LoRA Adapter: UWV/wim-n3-phi4-mini-adapter (681MB)
    • Requires the base microsoft/Phi-4-mini-instruct model
    • More flexible for further fine-tuning; a merge sketch follows below

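If you prefer to produce the merged weights yourself from the adapter, a minimal sketch with PEFT (the local output path is hypothetical):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model, apply the adapter, then fold the LoRA weights in.
base = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-4-mini-instruct",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, "UWV/wim-n3-phi4-mini-adapter")
merged = model.merge_and_unload()

merged.save_pretrained("wim-n3-phi4-mini-merged-local")      # hypothetical path
tokenizer = AutoTokenizer.from_pretrained("UWV/wim-n3-phi4-mini-adapter")
tokenizer.save_pretrained("wim-n3-phi4-mini-merged-local")
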
Pipeline Context

This model is part of the WIM (Wikipedia to Knowledge Graph) pipeline:

  1. N1: Entity Extraction
  2. N2: Schema.org Type Selection
  3. N3 (This Model): Transform to JSON-LD
  4. N4: Validation
  5. N5: Add Human-Readable Labels

N3 is the most computationally intensive step, handling the complex transformation from structured entity information to valid JSON-LD format.

Technical Notes

  • Memory Requirements: ~53GB VRAM for 128K context inference
  • Optimization: Uses Unsloth's custom kernels for efficient long-context processing
  • Special Configuration: Requires TORCH_COMPILE_DISABLE=1 for Phi-4 compatibility (see the snippet after this list)
  • Context Handling: Can process full Wikipedia articles with extensive entity information
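
The TORCH_COMPILE_DISABLE flag can be exported in the shell or set early in the process. A minimal sketch in Python:

import os

# Disable torch.compile before torch/transformers are imported, per the note above.
os.environ["TORCH_COMPILE_DISABLE"] = "1"

import torch
from transformers import AutoModelForCausalLM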

Citation

If you use this model, please cite:

@misc{wim-n3-phi4-mini,
  author = {UWV InnovatieHub},
  title = {Phi-4-mini N3 Transform to JSON-LD Model},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/UWV/wim-n3-phi4-mini-merged}
}