---
license: other
license_name: qwen
license_link: https://huggingface.co/skt/A.X-4.0/blob/main/LICENSE
language:
- en
- ko
pipeline_tag: text-generation
library_name: transformers
model_id: skt/A.X-4.0
developers: SKT AI Model Lab
model-index:
- name: A.X-4.0
  results:
  - task:
      type: generate_until
      name: mmlu
    dataset:
      name: mmlu (chat CoT)
      type: hails/mmlu_no_train
    metrics:
    - type: exact_match
      value: 86.62
      name: exact_match
  - task:
      type: generate_until
      name: kmmlu
    dataset:
      name: kmmlu (chat CoT)
      type: HAERAE-HUB/KMMLU
    metrics:
    - type: exact_match
      value: 78.32
      name: exact_match
---

# A.X 4.0

πŸ€— Models | πŸ’¬ Chat | πŸ“¬ APIs (FREE!) | πŸ–₯️ GitHub

## A.X 4.0 Family Highlights

SK Telecom released **A.X 4.0** (pronounced "A dot X"), a large language model (LLM) optimized for Korean-language understanding and enterprise deployment, on July 03, 2025. Built on the open-source [Qwen2.5](https://huggingface.co/collections/Qwen/qwen25-66e81a666513e518adb90d9e) model, A.X 4.0 has been further trained on large-scale Korean datasets to deliver outstanding performance in real-world business environments.

- **Superior Korean Proficiency**: Achieved a score of 78.3 on [KMMLU](https://huggingface.co/datasets/HAERAE-HUB/KMMLU), the leading benchmark for Korean-language evaluation and a Korean-specific adaptation of MMLU, outperforming GPT-4o (72.5).
- **Deep Cultural Understanding**: Scored 83.5 on [CLIcK](https://huggingface.co/datasets/EunsuKim/CLIcK), a benchmark for Korean cultural and contextual comprehension, surpassing GPT-4o (80.2).
- **Efficient Token Usage**: A.X 4.0 uses approximately 33% fewer tokens than GPT-4o for the same Korean input, enabling more cost-effective and efficient processing (see the tokenizer sketch below).
- **Deployment Flexibility**: Offered in both a 72B-parameter standard model (A.X 4.0) and a 7B lightweight version (A.X 4.0 Light).
- **Long Context Handling**: Supports up to 131,072 tokens, allowing comprehension of lengthy documents and conversations (the lightweight model supports up to 16,384 tokens).
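The token-efficiency claim above can be checked directly by comparing tokenizers. The sketch below assumes the `tiktoken` package for the GPT-4o side; the exact ratio varies with the input text, and the sample sentence here is illustrative only:

```python
from transformers import AutoTokenizer
import tiktoken  # OpenAI tokenizer library; provides GPT-4o's o200k encoding

ax_tokenizer = AutoTokenizer.from_pretrained("skt/A.X-4.0")
gpt4o_encoding = tiktoken.encoding_for_model("gpt-4o")

# Sample sentence: "SK Telecom has developed a large language model optimized for Korean."
text = "SK ν…”λ ˆμ½€μ€ ν•œκ΅­μ–΄μ— μ΅œμ ν™”λœ λŒ€κ·œλͺ¨ μ–Έμ–΄ λͺ¨λΈμ„ κ°œλ°œν–ˆμŠ΅λ‹ˆλ‹€."
n_ax = len(ax_tokenizer.encode(text))
n_gpt4o = len(gpt4o_encoding.encode(text))
print(f"A.X 4.0: {n_ax} tokens | GPT-4o: {n_gpt4o} tokens")
```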
## Performance

### Model Performance

| Category | Benchmark | A.X 4.0 | Qwen3-235B-A22B (w/o reasoning) | Qwen2.5-72B | GPT-4o |
|---|---|---|---|---|---|
| Knowledge | KMMLU | 78.32 | 73.64 | 66.44 | 72.51 |
| | CLIcK | 83.51 | 74.55 | 72.59 | 80.22 |
| | KoBALT | 47.30 | 41.57 | 37.00 | 44.00 |
| | MMLU | 86.62 | 87.37 | 85.70 | 88.70 |
| General | Ko-MT-Bench | 86.69 | 88.00 | 82.69 | 88.44 |
| | MT-Bench | 83.25 | 86.56 | 93.50 | 88.19 |
| | LiveBench (2024.11) | 52.30 | 64.50 | 54.20 | 52.19 |
| Instruction Following | Ko-IFEval | 77.96 | 77.53 | 77.07 | 75.38 |
| | IFEval | 86.05 | 85.77 | 86.54 | 83.86 |
| Math | HRM8K | 48.55 | 54.52 | 46.37 | 43.27 |
| | MATH | 74.28 | 72.72 | 77.00 | 72.38 |
| Code | HumanEval+ | 79.27 | 79.27 | 81.71 | 86.00 |
| | MBPP+ | 73.28 | 70.11 | 75.66 | 75.10 |
| | LiveCodeBench (2024.10~2025.04) | 26.07 | 33.09 | 27.58 | 29.30 |
| Long Context | LongBench (<128K) | 56.70 | 49.40 | 45.60 | 47.50 |
| Tool-use | FunctionChatBench | 85.96 | 82.43 | 88.30 | 95.70 |
### Lightweight Model Performance
| Category | Benchmark | A.X 4.0 Light | Qwen3-8B (w/o reasoning) | Qwen2.5-7B | EXAONE-3.5-7.8B | Kanana-1.5-8B |
|---|---|---|---|---|---|---|
| Knowledge | KMMLU | 64.15 | 63.53 | 49.56 | 53.76 | 48.28 |
| | CLIcK | 68.05 | 62.71 | 60.56 | 64.30 | 61.30 |
| | KoBALT | 30.29 | 26.57 | 21.57 | 21.71 | 23.14 |
| | MMLU | 75.43 | 82.89 | 75.40 | 72.20 | 68.82 |
| General | Ko-MT-Bench | 79.50 | 64.06 | 61.31 | 81.06 | 76.30 |
| | MT-Bench | 81.56 | 65.69 | 79.37 | 83.50 | 77.60 |
| | LiveBench | 37.10 | 50.20 | 37.00 | 40.20 | 29.40 |
| Instruction Following | Ko-IFEval | 72.99 | 73.39 | 60.73 | 65.01 | 69.96 |
| | IFEval | 84.68 | 85.38 | 76.73 | 82.61 | 80.11 |
| Math | HRM8K | 40.12 | 52.50 | 35.13 | 31.88 | 30.87 |
| | MATH | 68.88 | 71.48 | 65.58 | 63.20 | 59.28 |
| Code | HumanEval+ | 75.61 | 77.44 | 74.39 | 76.83 | 76.83 |
| | MBPP+ | 67.20 | 62.17 | 68.50 | 64.29 | 67.99 |
| | LiveCodeBench | 18.03 | 23.93 | 16.62 | 17.98 | 16.52 |
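The card does not state the exact harness or prompting setup behind these tables (the metadata above lists `generate_until`-style chat CoT runs over `hails/mmlu_no_train` and `HAERAE-HUB/KMMLU`). As a rough starting point only, a stock [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) run might look like the sketch below; default task configs will not necessarily reproduce the chat-CoT numbers reported here:

```python
import lm_eval

# Evaluate with the harness's default KMMLU/MMLU task configs; expect
# deviations from the tables above, which used a chat-style CoT setup.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=skt/A.X-4.0,dtype=bfloat16",
    tasks=["kmmlu", "mmlu"],
    batch_size=8,
)
print(results["results"])
```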
## πŸš€ Quickstart

### with HuggingFace Transformers

- `transformers>=4.46.0` or the latest version is required to use `skt/A.X-4.0`

```bash
pip install "transformers>=4.46.0"
```

#### Example Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "skt/A.X-4.0"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(model_name)

# System prompt: "You are an AI expert that translates English sentences provided by the user into Korean."
messages = [
    {"role": "system", "content": "당신은 μ‚¬μš©μžκ°€ μ œκ³΅ν•˜λŠ” μ˜μ–΄ λ¬Έμž₯듀을 ν•œκ΅­μ–΄λ‘œ λ²ˆμ—­ν•˜λŠ” AI μ „λ¬Έκ°€μž…λ‹ˆλ‹€."},
    {"role": "user", "content": "The first human went into space and orbited the Earth on April 12, 1961."},
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_new_tokens=128,
        do_sample=False,
    )

len_input_prompt = len(input_ids[0])
response = tokenizer.decode(output[0][len_input_prompt:], skip_special_tokens=True)
print(response)
# Output:
# 졜초의 인간이 1961λ…„ 4μ›” 12일에 우주둜 κ°€μ„œ 지ꡬ ꢀ도λ₯Ό λŒμ•˜μŠ΅λ‹ˆλ‹€.
```
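For interactive use, generation can also be streamed token by token. Below is a minimal sketch using the `TextStreamer` utility built into `transformers`, reusing `model`, `tokenizer`, and `input_ids` from the example above:

```python
from transformers import TextStreamer

# Prints decoded tokens to stdout as they are generated, skipping the prompt
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

with torch.no_grad():
    model.generate(
        input_ids,
        max_new_tokens=128,
        do_sample=False,
        streamer=streamer,
    )
```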
### with vLLM

- `vllm>=0.6.4.post1` or the latest version is required to use the tool-use function

```bash
pip install "vllm>=0.6.4.post1"

# if you don't want to activate the tool-use function, just comment out the vLLM option below
VLLM_OPTION="--enable-auto-tool-choice --tool-call-parser hermes"
vllm serve skt/A.X-4.0 $VLLM_OPTION
```

#### Example Usage

```python
from openai import OpenAI

def call(messages, model):
    completion = client.chat.completions.create(
        model=model,
        messages=messages,
    )
    print(completion.choices[0].message)

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="api_key"
)
model = "skt/A.X-4.0"

# Prompt: "What is the appropriate air-conditioner temperature in summer? Answer in one line."
messages = [{"role": "user", "content": "에어컨 여름철 적정 μ˜¨λ„λŠ”? ν•œμ€„λ‘œ λ‹΅λ³€ν•΄μ€˜"}]
call(messages, model)
# Output:
# ChatCompletionMessage(content='여름철 에어컨 적정 μ˜¨λ„λŠ” 24~26λ„μž…λ‹ˆλ‹€.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[], reasoning_content=None)

messages = [{"role": "user", "content": "What is the appropriate temperature for air conditioning in summer? Response in a single sentence."}]
call(messages, model)
# Output:
# ChatCompletionMessage(content='A comfortable and energy-efficient temperature for air conditioning in summer is typically around 24-26Β°C (75-79Β°F).', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[], reasoning_content=None)
```

#### Examples for tool-use

```python
from openai import OpenAI

def call(messages, model):
    completion = client.chat.completions.create(
        model=model,
        messages=messages,
        tools=tools
    )
    print(completion.choices[0].message)

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="api_key"
)
model = "skt/A.X-4.0"

# Description: "Computes the discounted price from the original price and a discount rate (in percent)."
calculate_discount = {
    "type": "function",
    "function": {
        "name": "calculate_discount",
        "description": "원가격과 ν• μΈμœ¨(νΌμ„ΌνŠΈ λ‹¨μœ„)을 μž…λ ₯λ°›μ•„ ν• μΈλœ 가격을 κ³„μ‚°ν•œλ‹€.",
        "parameters": {
            "type": "object",
            "properties": {
                "original_price": {
                    "type": "number",
                    "description": "μƒν’ˆμ˜ μ›λž˜ 가격"
                },
                "discount_percentage": {
                    "type": "number",
                    "description": "μ μš©ν•  ν• μΈμœ¨(예: 20% ν• μΈμ˜ 경우 20을 μž…λ ₯)"
                }
            },
            "required": ["original_price", "discount_percentage"]
        }
    }
}

# Description: "Fetches the exchange rate between two currencies."
get_exchange_rate = {
    "type": "function",
    "function": {
        "name": "get_exchange_rate",
        "description": "두 톡화 κ°„μ˜ ν™˜μœ¨μ„ κ°€μ Έμ˜¨λ‹€.",
        "parameters": {
            "type": "object",
            "properties": {
                "base_currency": {
                    "type": "string",
                    "description": "The currency to convert from."
                },
                "target_currency": {
                    "type": "string",
                    "description": "The currency to convert to."
                }
            },
            "required": ["base_currency", "target_currency"]
        }
    }
}

tools = [calculate_discount, get_exchange_rate]

### Slot filling ###
# User: "We need to buy something that's originally 57,600 won, and I can get an employee discount. Calculate the discounted price for me."
messages = [{"role": "user", "content": "μš°λ¦¬κ°€ 뭘 μ‚¬μ•Όλ˜λŠ”λ° μ›λž˜ 57600원인데 직원할인 받을 수 μžˆκ±°λ“ ? 할인가쒀 κ³„μ‚°ν•΄μ€˜"}]
call(messages, model)
# Output (the model asks what percentage the employee discount is):
# ChatCompletionMessage(content='직원 할인을 λͺ‡ νΌμ„ΌνŠΈ 받을 수 μžˆλŠ”μ§€ μ•Œλ €μ£Όμ‹œκ² μ–΄μš”?', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[], reasoning_content=None)

### Function calling ###
messages = [
    {"role": "user", "content": "μš°λ¦¬κ°€ 뭘 μ‚¬μ•Όλ˜λŠ”λ° μ›λž˜ 57600원인데 직원할인 받을 수 μžˆκ±°λ“ ? 할인가쒀 κ³„μ‚°ν•΄μ€˜"},
    {"role": "assistant", "content": "직원 할인을 λͺ‡ νΌμ„ΌνŠΈ 받을 수 μžˆλŠ”μ§€ μ•Œλ €μ£Όμ‹œκ² μ–΄μš”?"},
    {"role": "user", "content": "15% 할인 받을 수 μžˆμ–΄."},  # "I can get a 15% discount."
]
call(messages, model)
# Output:
# ChatCompletionMessage(content=None, refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='chatcmpl-tool-8634f423e5494ebfa7a428b8385b4981', function=Function(arguments='{"original_price": 57600, "discount_percentage": 15}', name='calculate_discount'), type='function')], reasoning_content=None)
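
# Client-side step between the function call above and the "Completion" turn
# below: execute the requested tool locally, then return its result in a
# "tool" message. This local implementation is a hypothetical stand-in;
# the model only ever sees the JSON schema above, never this code.
def run_calculate_discount(original_price: float, discount_percentage: float) -> dict:
    discounted_price = original_price * (1 - discount_percentage / 100)
    return {
        "original_price": original_price,
        "discount_percentage": discount_percentage,
        "discounted_price": discounted_price,
    }

# If `call` returned the completion, the arguments could be recovered with
# json.loads(completion.choices[0].message.tool_calls[0].function.arguments)
# and passed straight in:
print(run_calculate_discount(original_price=57600, discount_percentage=15))
# {'original_price': 57600, 'discount_percentage': 15, 'discounted_price': 48960.0}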

### Completion ###
messages = [
    {"role": "user", "content": "μš°λ¦¬κ°€ 뭘 μ‚¬μ•Όλ˜λŠ”λ° μ›λž˜ 57600원인데 직원할인 받을 수 μžˆκ±°λ“ ? 할인가쒀 κ³„μ‚°ν•΄μ€˜"},
    {"role": "assistant", "content": "직원 할인을 λͺ‡ νΌμ„ΌνŠΈ 받을 수 μžˆλŠ”μ§€ μ•Œλ €μ£Όμ‹œκ² μ–΄μš”?"},
    {"role": "user", "content": "15% 할인 받을 수 μžˆμ–΄."},
    {"role": "tool", "tool_call_id": "random_id", "name": "calculate_discount", "content": "{\"original_price\": 57600, \"discount_percentage\": 15, \"discounted_price\": 48960.0}"}
]
call(messages, model)
# Output ("With a 15% discount, the price of the item comes to 48,960 won."):
# ChatCompletionMessage(content='15% 할인을 μ μš©ν•˜λ©΄ μƒν’ˆμ˜ 가격은 48,960원이 λ©λ‹ˆλ‹€.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[], reasoning_content=None)
```

## Citation

```
@article{SKTAdotX4,
  title={A.X 4.0},
  author={SKT AI Model Lab},
  year={2025},
  url={https://huggingface.co/skt/A.X-4.0}
}
```

## Contact

- Business & Partnership Contact: [a.x@sk.com](mailto:a.x@sk.com)