Ksenia Se

Kseniase

AI & ML interests

None yet


Organizations

Turing Post · Journalists on Hugging Face · Social Post Explorers · Hugging Face Discord Community · Sandbox

Kseniase's activity

upvoted an article about 24 hours ago
Article

Topic 28: What is Mixture-of-Mamba?

By Kseniase and 1 other • 2
published an article about 24 hours ago
Article

Topic 28: What is Mixture-of-Mamba?

By Kseniase and 1 other • 2
reacted to their post with 😎👍🚀🔥 about 24 hours ago
Post
3011
8 New Applications of Test-Time Scaling

We've noticed a huge interest in test-time scaling (TTS), so we decided to explore this concept further. Test-time compute (TTC) refers to the amount of computational power an AI model uses when generating a response. Many researchers are now focused on scaling TTC, as it enables slow, deep "thinking" and step-by-step reasoning, which improves models' overall performance.
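As a rough illustration of what spending more compute at inference time can look like, here is a minimal best-of-N sampling sketch, one of the simplest TTS strategies. The `generate` and `score` helpers are hypothetical placeholders for an LLM sampling call and a verifier; this is not code from any of the papers below.

```python
# Minimal best-of-N sampling: more samples = more test-time compute = better odds
# of finding a good answer. `generate` and `score` are hypothetical placeholders.
from typing import Callable, List

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int = 8) -> str:
    """Sample n candidate answers and keep the one the verifier scores highest."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda answer: score(prompt, answer))
```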

Here are 8 fresh studies on test-time scaling:

1. Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach (2502.05171)
Introduces an LM that scales TTC by reasoning in latent space instead of generating more tokens, without any special training. A recurrent block processes information iteratively (a toy sketch of this idea follows the list below).

2. Generating Symbolic World Models via Test-time Scaling of Large Language Models (2502.04728)
Shows how TTS is applied to enhance a model's Planning Domain Definition Language (PDDL) reasoning capabilities, which can then be used to generate a symbolic world model.

3. Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling (2502.06703)
Analyzes optimal TTS strategies and shows how small models can outperform much larger ones.

4. Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis (2502.04128)
Shows how TTS improves expressiveness, timbre consistency and accuracy in speech synthesis with the Llasa framework. It also dives into the benefits of scaling train-time compute.

5. Rethinking Fine-Tuning when Scaling Test-Time Compute: Limiting Confidence Improves Mathematical Reasoning (2502.07154)
Suggests a modified training loss that improves LLM reasoning when scaling TTC.

6. Adaptive Graph of Thoughts: Test-Time Adaptive Reasoning Unifying Chain, Tree, and Graph Structures (2502.05078)
Unifies the strengths of chain, tree, and graph paradigms into one framework that expands reasoning only on necessary subproblems.

7. Sample, Scrutinize and Scale: Effective Inference-Time Search by Scaling Verification (2502.01839)
Explores scaling trends of self-verification and how to improve its capabilities with TTC.

8. CodeMonkeys: Scaling Test-Time Compute for Software Engineering (2501.14723)
Explores how scaling serial compute (iterations) and parallel compute (trajectories) can improve accuracy on real-world software engineering issues.
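To make paper 1 a bit more concrete, here is a toy sketch of the recurrent-depth idea: loop a recurrent block over a latent state and raise the number of iterations at test time. The layer choice and sizes are invented for illustration; this is not the paper's actual architecture.

```python
# Toy recurrent-depth reasoning: spend more test-time compute by iterating a
# recurrent block in latent space instead of generating more tokens.
# Layer choice and sizes are invented for illustration only.
import torch
import torch.nn as nn

class RecurrentDepthBlock(nn.Module):
    def __init__(self, d_model: int = 512):
        super().__init__()
        self.core = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)

    def forward(self, hidden: torch.Tensor, n_iters: int) -> torch.Tensor:
        # More iterations = more latent "thinking" at inference time.
        for _ in range(n_iters):
            hidden = self.core(hidden)
        return hidden

block = RecurrentDepthBlock()
latent = torch.randn(1, 16, 512)     # (batch, sequence, d_model)
shallow = block(latent, n_iters=2)   # cheap answer
deep = block(latent, n_iters=32)     # scaled-up test-time compute
```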

Also, explore our article about TTS for more -> https://huggingface.co/blog/Kseniase/testtimecompute
  • 1 reply
upvoted an article 3 days ago
Article

🌁#88: Can DeepSeek Inspire Global Collaboration?

By Kseniase • 3
published an article 4 days ago
Article

🌁#88: Can DeepSeek Inspire Global Collaboration?

By Kseniase • 3
posted an update 5 days ago
Post
3011
8 New Applications of Test-Time Scaling

upvoted an article 6 days ago
Article

🦸🏻#10: Does Present-Day GenAI Actually Reason?

By Kseniase • 5
published an article 6 days ago
Article

🦸🏻#10: Does Present-Day GenAI Actually Reason?

By Kseniase • 5
upvoted an article 8 days ago
Article

Topic 27: What are Chain-of-Agents and Chain-of-RAG?

By Kseniase and 1 other • 10
published an article 8 days ago
Article

Topic 27: What are Chain-of-Agents and Chain-of-RAG?

By Kseniase and 1 other • 10
reacted to their post with 🚀🤗🔥 9 days ago
Post
7622
8 New Types of RAG

RAG techniques continuously evolve to enhance LLM response accuracy by retrieving relevant external data during generation. To keep up with current AI trends, new RAG types incorporate deep step-by-step reasoning, tree search, citations, multimodality and other effective techniques.
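For context, every variant below builds on the same bare-bones retrieve-then-generate loop, sketched here with hypothetical `embed`, `vector_search` and `llm` helpers standing in for an embedding model, a vector index and a language model call.

```python
# Bare-bones RAG: retrieve relevant passages, then condition generation on them.
# `embed`, `vector_search` and `llm` are hypothetical placeholders.
from typing import Callable, List

def rag_answer(question: str,
               embed: Callable[[str], List[float]],
               vector_search: Callable[[List[float], int], List[str]],
               llm: Callable[[str], str],
               k: int = 4) -> str:
    passages = vector_search(embed(question), k)   # retrieve external evidence
    context = "\n\n".join(passages)
    prompt = (f"Answer using only the context below.\n\n"
              f"Context:\n{context}\n\nQuestion: {question}")
    return llm(prompt)                              # generate a grounded response
```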

Here's a list of 8 latest RAG advancements:

1. DeepRAG -> DeepRAG: Thinking to Retrieval Step by Step for Large Language Models (2502.01142)
Models retrieval-augmented reasoning as a Markov Decision Process, enabling strategic retrieval. It dynamically decides when to retrieve external knowledge and when to rely on parametric reasoning (a toy sketch of this decision loop follows the list below).

2. RealRAG -> RealRAG: Retrieval-augmented Realistic Image Generation via Self-reflective Contrastive Learning (2502.00848)
Enhances novel object generation by retrieving real-world images and using self-reflective contrastive learning to fill knowledge gaps, improve realism and reduce distortions.

3. Chain-of-Retrieval Augmented Generation (CoRAG) -> Chain-of-Retrieval Augmented Generation (2501.14342)
Retrieves and refines information step by step, deciding how much compute to use at test time and reformulating queries when needed.

4. VideoRAG -> VideoRAG: Retrieval-Augmented Generation over Video Corpus (2501.05874)
Enables unlimited-length video processing, using a dual-channel architecture that integrates graph-based textual grounding and multi-modal context encoding.

5. CFT-RAG -> CFT-RAG: An Entity Tree Based Retrieval Augmented Generation Algorithm With Cuckoo Filter (2501.15098)
A tree-RAG acceleration method that uses an improved Cuckoo Filter to optimize entity localization, enabling faster retrieval.

6. Contextualized Graph RAG (CG-RAG) -> CG-RAG: Research Question Answering by Citation Graph Retrieval-Augmented LLMs (2501.15067)
Uses Lexical-Semantic Graph Retrieval (LeSeGR) to integrate sparse and dense signals within the graph structure and capture citation relationships.

7. GFM-RAG -> GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation (2502.01113)
A graph foundation model that uses a graph neural network to refine query-knowledge connections.

8. URAG -> URAG: Implementing a Unified Hybrid RAG for Precise Answers in University Admission Chatbots -- A Case Study at HCMUT (2501.16276)
A hybrid system combining rule-based and RAG methods to improve lightweight LLMs for educational chatbots.
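As promised under DeepRAG (item 1), here is a toy version of that retrieve-or-not decision loop: at each step the system either trusts the model's parametric knowledge or pays for retrieval. The `decompose`, `confidence`, `retrieve` and `llm` helpers are hypothetical, and this is not DeepRAG's actual MDP policy.

```python
# Toy stepwise retrieval decision: retrieve only when parametric confidence is low.
# All helpers are hypothetical placeholders, not DeepRAG's implementation.
from typing import Callable, List

def stepwise_rag(question: str,
                 decompose: Callable[[str], List[str]],
                 confidence: Callable[[str], float],
                 retrieve: Callable[[str], str],
                 llm: Callable[[str], str],
                 threshold: float = 0.7) -> str:
    notes: List[str] = []
    for sub_q in decompose(question):          # break the query into reasoning steps
        if confidence(sub_q) >= threshold:     # trust parametric knowledge
            notes.append(llm(sub_q))
        else:                                  # otherwise pay for external retrieval
            notes.append(llm(f"{sub_q}\n\nEvidence: {retrieve(sub_q)}"))
    return llm(f"Question: {question}\n\nIntermediate findings:\n" + "\n".join(notes))
```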
  • 1 reply
upvoted an article 11 days ago
Article

🌁#87: Why DeepResearch Should Be Your New Hire

By Kseniase • 5
published an article 11 days ago
Article

🌁#87: Why DeepResearch Should Be Your New Hire

By Kseniase • 5
replied to their post 11 days ago

Other important RAG advancements:

  • SafeRAG (a benchmark) -> https://huggingface.co/papers/2501.18636
    Establishes a security benchmark revealing how RAG systems are vulnerable to attacks like adversarial data injection, inter-context conflicts, and soft ad poisoning. Evaluates weaknesses in 14 RAG components, emphasizing the need for better filtering and security measures.

  • Topic-FlipRAG: Adversarial Opinion Manipulation -> https://huggingface.co/papers/2502.01386
    Demonstrates a two-stage adversarial attack that manipulates RAG-generated opinions on sensitive topics. Alters retrieval rankings and LLM reasoning to subtly flip the stance of generated answers, exposing the difficulty of mitigating semantic-level manipulation.

  • Experiments with LLMs on RAG for Closed-Source Simulation Software -> https://huggingface.co/papers/2502.03916
    Tests how RAG can support proprietary software by injecting relevant documentation dynamically. Shows that retrieval helps mitigate hallucinations in closed-source contexts, though some knowledge gaps remain, necessitating further improvements.

  • Health-RAG -> https://huggingface.co/papers/2502.04666
    Focuses on medical information retrieval by introducing a three-stage pipeline: retrieve, generate a reference summary (GenText), and re-rank based on factual alignment. Ensures accurate, evidence-backed health answers while mitigating misinformation risks; a rough sketch of this pipeline follows below.
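Here is a rough sketch of that three-stage Health-RAG-style pipeline: retrieve candidates, generate a reference summary (GenText), then re-rank by alignment with it. The `retrieve`, `llm` and `alignment_score` helpers are hypothetical, not the paper's implementation.

```python
# Rough three-stage pipeline: retrieve -> generate reference summary -> re-rank.
# `retrieve`, `llm` and `alignment_score` are hypothetical placeholders.
from typing import Callable, List

def three_stage_answer(query: str,
                       retrieve: Callable[[str, int], List[str]],
                       llm: Callable[[str], str],
                       alignment_score: Callable[[str, str], float],
                       k: int = 10) -> str:
    candidates = retrieve(query, k)                                           # stage 1: retrieve
    gen_text = llm(f"Write a short, factual reference summary for: {query}")  # stage 2: GenText
    ranked = sorted(candidates,
                    key=lambda doc: alignment_score(doc, gen_text),
                    reverse=True)                                             # stage 3: re-rank
    context = "\n\n".join(ranked[:3])
    return llm(f"Context:\n{context}\n\nQuestion: {query}")
```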