Upload folder using huggingface_hub
nemotron_fineinstructions_1T_exp_chat_sft/hf/README.md
CHANGED
@@ -10,7 +10,7 @@ tokenizer.padding_side = 'left'
 model = AutoModelForCausalLM.from_pretrained('/mnt/nlpgpu-io1/data/ajayp/output/fineinstructions/dated/2025-09-03-14:45:23/data/sft_v4_fineinstructions-pretraining_experiments-nemotron_fineinstructions_1T_exp_chat-hf/export_path', revision=None) # Load model
 pipe = pipeline('text-generation', model=model, tokenizer=tokenizer, pad_token_id=tokenizer.pad_token_id, return_full_text=False)
 
-inputs = ["Instruction:
+inputs = ["Instruction: If 5 machines can produce 5 widgets in 5 minutes, how long will it take 100 machines to produce 100 widgets?\n\nAnswer:From the initial statement, we can deduce that one machine takes 5 minutes to produce one widget. This is because the rate of production is constant across all machines and all widgets.\n\nGiven this, if we had 100 machines working simultaneously, they could also each produce one widget in 5 minutes. This is because the machines work independently and do not affect each other's performance.\n\nSo, even with 100 machines producing 100 widgets, it would still only take 5 minutes to produce all the widgets.\n\nInstruction: Given the constant rate of production, what factors could potentially decrease the efficiency of these machines over time?\n\nAnswer:Several factors could potentially decrease the efficiency of these machines over time:\n\n1. Wear and Tear: Over time, machines can degrade due to constant use, leading to slower production times or more frequent breakdowns. Regular maintenance can help mitigate this, but not completely prevent it.\n\n2. Lack of Maintenance: Regular maintenance is crucial for keeping machines running efficiently. If maintenance is neglected, the machines may not operate at their full capacity.\n\n3. Outdated Technology: As technology advances, older machines may become less efficient compared to newer models. Upgrading to newer machines can help maintain efficiency.\n\n4. Power Supply Issues: If there are fluctuations in the power supply, it could affect the machines' operation and slow down production.\n\n5. Environmental Factors: Changes in the environment, such as temperature, humidity, or dust, can affect a machine's performance. \n\n6. Operator Error: If the machines are not used correctly, it could lead to decreased efficiency. Proper training of operators can help avoid this.\n\n7. Supply Chain Disruptions: If there are issues with the supply of raw materials, it could lead to downtime and decreased efficiency. \n\n8. Software Glitches: If the machines are controlled by software, any bugs or glitches could affect their performance.\n\n9. Unexpected Breakdowns: Unforeseen mechanical or electrical failures can cause downtime and decrease overall efficiency.\n\n10. Regulatory Changes: Changes in regulations or standards can require modifications to the machines, which could temporarily decrease efficiency.\n\nInstruction: Considering the factors that could decrease the efficiency of the machines, how would you devise a comprehensive maintenance plan to mitigate these issues, while also factoring in potential costs and downtime?\n\nAnswer:"]
 print(pipe(inputs, max_length=8192, do_sample=False))
 ```
 
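Put together, the README's quick-start amounts to the sketch below. This is not the README verbatim: the model path is a placeholder for the local `export_path` in the diff, the tokenizer setup is inferred from the hunk's context line `tokenizer.padding_side = 'left'`, and the pad-token fallback is an added assumption (Llama tokenizers often ship without one).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Placeholder: the README loads from a local export_path; a Hub repo id
# would go here instead.
MODEL_PATH = "/path/to/export_path"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
tokenizer.padding_side = 'left'              # context line from the hunk header
if tokenizer.pad_token is None:              # assumption: Llama tokenizers often lack a pad token
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, revision=None)  # Load model
pipe = pipeline('text-generation', model=model, tokenizer=tokenizer,
                pad_token_id=tokenizer.pad_token_id, return_full_text=False)

inputs = ["Instruction: If 5 machines can produce 5 widgets in 5 minutes, "
          "how long will it take 100 machines to produce 100 widgets?\n\nAnswer:"]
print(pipe(inputs, max_length=8192, do_sample=False))  # greedy decoding
```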
nemotron_fineinstructions_1T_exp_chat_sft/hf/config.json
CHANGED
@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "/nlpgpu/data/ajayp/.cache/huggingface_cache/hub/models--fineinstructions--pretraining_experiments/snapshots/
+  "_name_or_path": "/nlpgpu/data/ajayp/.cache/huggingface_cache/hub/models--fineinstructions--pretraining_experiments/snapshots/ddf11e491a13030968737926401b8e4a900e3bb4/nemotron_fineinstructions_1T_exp_chat/hf",
   "architectures": [
     "LlamaForCausalLM"
   ],
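`_name_or_path` is provenance metadata written by `save_pretrained`; this commit completes the previously truncated snapshot path. A minimal sketch for inspecting it, assuming a local checkout of this folder:

```python
import json

# Hypothetical local checkout of this repo folder.
with open("nemotron_fineinstructions_1T_exp_chat_sft/hf/config.json") as f:
    config = json.load(f)

print(config["architectures"])   # ["LlamaForCausalLM"], per the diff
print(config["_name_or_path"])   # snapshot path recorded at export time; informational only
```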
nemotron_fineinstructions_1T_exp_chat_sft/hf/model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:aef4e287f572bc253d7b47f29062eae23fbf8ae7eadae455934968908916e69c
 size 3619919680
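What git tracks for `model.safetensors` is a Git LFS pointer, not the weights themselves: a spec version, the blob's sha256 (`oid`), and its byte size. A minimal sketch for verifying a downloaded copy against this pointer, assuming the file sits in the working directory:

```python
import hashlib

# Values copied from the LFS pointer in the diff.
EXPECTED_OID = "aef4e287f572bc253d7b47f29062eae23fbf8ae7eadae455934968908916e69c"
EXPECTED_SIZE = 3619919680

sha, size = hashlib.sha256(), 0
with open("model.safetensors", "rb") as f:            # hypothetical local download
    for chunk in iter(lambda: f.read(1 << 20), b""):  # stream in 1 MiB chunks
        sha.update(chunk)
        size += len(chunk)

assert size == EXPECTED_SIZE, f"size mismatch: {size} != {EXPECTED_SIZE}"
assert sha.hexdigest() == EXPECTED_OID, "sha256 mismatch"
print("model.safetensors matches its LFS pointer")
```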
nemotron_fineinstructions_1T_exp_chat_sft/hf/training_args.json
CHANGED
@@ -14,7 +14,7 @@
   "eval_accumulation_steps": 1,
   "eval_delay": 0,
   "torch_empty_cache_steps": null,
-  "learning_rate":
+  "learning_rate": 0.0001,
   "weight_decay": 0.01,
   "adam_beta1": 0.9,
   "adam_beta2": 0.999,
@@ -29,7 +29,7 @@
   "log_level": "passive",
   "log_level_replica": "warning",
   "log_on_each_node": true,
-  "logging_dir": "/mnt/nlpgpu-io1/data/ajayp/output/fineinstructions/dated/2025-09-03-14:45:23/data/sft_v4_fineinstructions-pretraining_experiments-nemotron_fineinstructions_1T_exp_chat-hf/post-train-sft/_checkpoints/runs/
+  "logging_dir": "/mnt/nlpgpu-io1/data/ajayp/output/fineinstructions/dated/2025-09-03-14:45:23/data/sft_v4_fineinstructions-pretraining_experiments-nemotron_fineinstructions_1T_exp_chat-hf/post-train-sft/_checkpoints/runs/Sep06_09-02-31_nlpgpu06.seas.upenn.edu",
   "logging_strategy": "steps",
   "logging_first_step": false,
   "logging_steps": 1,
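The two hunks pin down the optimizer and logging settings for this SFT run. For reference, a sketch of the equivalent `TrainingArguments` construction; `output_dir` is hypothetical, and every field not visible in the diff is left at the transformers default:

```python
from transformers import TrainingArguments

# Only the fields visible in the two hunks are set explicitly.
args = TrainingArguments(
    output_dir="post-train-sft/_checkpoints",  # hypothetical
    learning_rate=1e-4,          # the value this commit fills in
    weight_decay=0.01,
    adam_beta1=0.9,
    adam_beta2=0.999,
    eval_accumulation_steps=1,
    eval_delay=0,
    logging_strategy="steps",
    logging_first_step=False,
    logging_steps=1,
)
print(args.learning_rate, args.adam_beta2)   # 0.0001 0.999
```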