---
base_model:
- deepseek-ai/DeepSeek-V3.1-Base
base_model_relation: quantized
---

## Q4_K_M static quant of deepseek-ai/DeepSeek-V3.1-Base

Using llama.cpp release b6182 for quantization.

Original model: https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Base

I'm uploading this because I'm using it to calculate the imatrix, so I figured I might as well provide it in the meantime.

Remember, this is a **BASE** model, so it likely will not chat properly unless you give it multiple turns of examples. For instance, I've had success with:

```
./llama-cli -m /models/deepseek-ai_DeepSeek-V3.1-Base-Q4_K_M-00001-of-00011.gguf -p "You are a helpful assistant.Hello, who are you?I am DeepSeek, a helpful AI assistant.How are you today?I'm doing well! Is there anything I can assist you with?Can you explain the laws of thermodynamics?" -no-cnv -ngl 0 --reverse-prompt ""
```

Prompt for easier viewing:

`You are a helpful assistant.Hello, who are you?I am DeepSeek, a helpful AI assistant.How are you today?I'm doing well! Is there anything I can assist you with?Can you explain the laws of thermodynamics?`

*Yes*, I am using `` and `` as opposed to the special tokens `<｜User｜>` and `<｜Assistant｜>`. For some reason this seems to be more stable.

This resulted in a completely coherent reply:

> Sure, here's a brief explanation of the laws of thermodynamics:
>
> 1. Zeroth Law of Thermodynamics: If two thermodynamic systems are each in thermal equilibrium with a third system, then they are in thermal equilibrium with each other.
> 2. First Law of Thermodynamics: The total energy of an isolated system is constant; energy can be transformed from one form to another, but cannot be created or destroyed.
> 3. Second Law of Thermodynamics: The entropy of an isolated system not in equilibrium will tend to increase over time, approaching a maximum value at equilibrium.
> 4. Third Law of Thermodynamics: As the temperature of a system approaches absolute zero, the entropy of the system approaches a minimum value.
>
> Would you like more details on any of these laws?

The idea is that you need to teach the base model what a conversation looks like first. Base models usually aren't capable of one-shotting a conversation, since they haven't been tuned to understand roles.

Total size: 382G
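
If you want to build this kind of multi-turn example prompt without hand-editing one long string, here is a minimal sketch. The `build_prompt` helper and the `[user]` / `[assistant]` markers are hypothetical placeholders for illustration only; they are not the markers used in the command above, so swap in whatever plain-text role markers work for you. Ending the prompt with an open assistant marker is also an assumption of this sketch.

```shell
#!/bin/sh
# Hypothetical role markers, chosen only for this sketch.
USER_TAG="[user]"
ASSISTANT_TAG="[assistant]"

# build_prompt SYSTEM QUESTION [USER_MSG ASSISTANT_MSG]...
# Concatenates the system text, each example exchange, and the final
# question, then leaves an assistant marker open for the model to complete.
# The function body runs in a subshell so its variables stay local.
build_prompt() (
  system=$1
  question=$2
  shift 2
  out=$system
  # Consume the remaining arguments two at a time (user turn, assistant turn).
  while [ "$#" -ge 2 ]; do
    out="${out}${USER_TAG}$1${ASSISTANT_TAG}$2"
    shift 2
  done
  printf '%s' "${out}${USER_TAG}${question}${ASSISTANT_TAG}"
)

PROMPT="$(build_prompt \
  "You are a helpful assistant." \
  "Can you explain the laws of thermodynamics?" \
  "Hello, who are you?" "I am DeepSeek, a helpful AI assistant." \
  "How are you today?" "I'm doing well! Is there anything I can assist you with?")"

printf '%s\n' "$PROMPT"
```

The resulting `$PROMPT` can then be passed as the `-p` argument of the `llama-cli` command shown earlier. Adding more example exchanges is just a matter of appending more user/assistant pairs to the call.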