ROHITH VENKATA REDDY
knight7561
AI & ML interests
Deep learning, Autonomous Driving
Organizations
LLM and Reasoning Papers
Papers dump of LLM Reasoning domain
- Internal Consistency and Self-Feedback in Large Language Models: A Survey
  Paper • 2407.14507 • Published • 46
- Large Language Models are Zero-Shot Reasoners
  Paper • 2205.11916 • Published • 2
- Let's Verify Step by Step
  Paper • 2305.20050 • Published • 11
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
  Paper • 2201.11903 • Published • 14
Post Training
- Aligning Instruction Tuning with Pre-training
  Paper • 2501.09368 • Published
- Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
  Paper • 2403.14608 • Published
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 63
models
5
knight7561/SmolLM2_python_coder-FT-ORPO
Text Generation • 0.1B • Updated • 8
knight7561/SmolLM2-FT-DPO-python-code
Text Generation • 0.1B • Updated • 6
knight7561/SmolLM2_python_coder
Text Generation • 0.1B • Updated • 32
knight7561/SmolLM2-eli5_precomputed_top_slice
Text Generation • 0.1B • Updated • 6
knight7561/SmolLM2-FT-MyDataset
Text Generation • 0.1B • Updated • 6
datasets
0
None public yet