Alberto Cetoli PRO

fractalego

AI & ML interests

Entity/relation extraction, Q&A, Summarisation


Organizations

Blog-explorers, Hugging Face Discord Community, open/ acc

fractalego's activity

reacted to BrigitteTousi's post with 🔥🚀 about 22 hours ago
reacted to csabakecskemeti's post with 🔥 12 days ago
Testing training on AMD/ROCm for the first time!

I've got my hands on an AMD Instinct MI100. Used, it's about the same price as a V100, but on paper it has more TOPS (14 TOPS for the V100 vs. 23 TOPS for the MI100), and its HBM runs at a faster clock, giving 1.2 TB/s of memory bandwidth.
For quantized inference it's a beast (the MI50 was also surprisingly fast).

For LoRA training in this quick test I could not get the bitsandbytes (bnb) config to work, so I'm running the fine-tune on the full-size model.

I will share everything I've learned about the install, setup, and settings in a blog post, together with the 3D design for the cooling shroud.
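
For reference, a minimal sketch of what a LoRA fine-tune on full-precision weights (i.e. without a bitsandbytes quantization config) can look like with transformers and peft; the base model, dataset, target modules, and hyperparameters below are illustrative placeholders, not the exact setup described in the post.

```python
# Hypothetical LoRA fine-tuning sketch on full-size (non-quantized) weights,
# i.e. without a bitsandbytes config; model, dataset, and hyperparameters
# are placeholders.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "facebook/opt-350m"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # full-precision load

# Attach LoRA adapters to the attention projections only.
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                         target_modules=["q_proj", "v_proj"],
                         task_type="CAUSAL_LM")
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()

# Small text corpus tokenized for causal language modelling.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
dataset = dataset.filter(lambda row: len(row["text"].strip()) > 0)
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-test",
                           per_device_train_batch_size=4,
                           num_train_epochs=1, logging_steps=10),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```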
upvoted an article 12 days ago
reacted to burtenshaw's post with 🔥 15 days ago
Now the Hugging Face agents course is getting real, with frameworks like smolagents, LlamaIndex, and LangChain.

🔗 Follow the org for updates: https://huggingface.co/agents-course

This week we are releasing the first framework unit in the course, and it's on smolagents. This is what the unit covers:

- why you should use smolagents vs. another library
- how to build agents that use code (see the sketch after this list)
- how to build multi-agent systems
- how to use vision language models for browser use

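As a loose illustration of the code-agent idea covered in the unit, here is a minimal sketch using smolagents, roughly following the library's quickstart pattern; the web-search tool, the default hosted model, and the example question are my assumptions, not course material.

```python
# Minimal smolagents sketch (assumes smolagents is installed and a Hugging Face
# token is configured). The agent writes and executes Python code to answer.
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

model = HfApiModel()  # hosted inference model; swap in any supported model class
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=model)

# The agent plans in generated Python, calls the search tool as needed,
# and returns a final answer.
print(agent.run("Which GPU has more TOPS on paper, a V100 or an MI100?"))
```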

The team has been working flat out on this for a few weeks, led by @sergiopaniego and supported by smolagents author @m-ric.
upvoted an article 15 days ago

FastRTC: The Real-Time Communication Library for Python

reacted to csabakecskemeti's post with 👍 25 days ago
reacted to mmhamdy's post with 👍 27 days ago
⛓ Evaluating Long Context #2: SCROLLS and ZeroSCROLLS

In this series of posts tracing the history of long context evaluation, we started with Long Range Arena (LRA). Introduced in 2020, LRA is one of the earliest benchmarks designed to tackle the challenge of long context evaluation. It wasn't introduced to evaluate LLMs, however, but rather the transformer architecture in general.

📜 The SCROLLS benchmark, introduced in 2022, addresses this gap in NLP/LLM research. SCROLLS challenges models with tasks that require reasoning over extended sequences (according to 2022 standards). So, what does it offer?

1️⃣ Long Text Focus: SCROLLS (unlike LRA) focuses mainly on text and contains inputs with thousands of words, testing models' ability to synthesize information across lengthy documents.
2️⃣ Diverse Tasks: Includes summarization, question answering, and natural language inference across domains like literature, science, and business.
3️⃣ Unified Format: All datasets are available in a text-to-text format, facilitating easy evaluation and comparison of models (see the loading sketch at the end of this post).

Building on SCROLLS, ZeroSCROLLS takes long text evaluation to the next level by focusing on zero-shot learning. Other features include:

1️⃣ New Tasks: Introduces tasks like sentiment aggregation and sorting book chapter summaries.
2️⃣ Leaderboard: A live leaderboard encourages continuous improvement and competition among researchers.

💡 What are some other landmark benchmarks in the history of long context evaluation? Feel free to share your thoughts and suggestions in the comments.

- SCROLLS Paper: SCROLLS: Standardized CompaRison Over Long Language Sequences (2201.03533)
- ZeroSCROLLS Paper: ZeroSCROLLS: A Zero-Shot Benchmark for Long Text Understanding (2305.14196)
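
As a small illustration of that unified text-to-text format, the sketch below loads one SCROLLS task and its ZeroSCROLLS counterpart from the Hugging Face Hub; the repo IDs, config, split, and field names are assumptions based on the public dataset cards, and a datasets version that still supports script-based datasets is assumed.

```python
# Sketch: inspect the text-to-text format of SCROLLS / ZeroSCROLLS.
# Repo IDs, config, split, and field names are assumptions; both datasets
# ship loading scripts, hence trust_remote_code=True.
from datasets import load_dataset

scrolls = load_dataset("tau/scrolls", "gov_report",
                       split="validation", trust_remote_code=True)
example = scrolls[0]
print(example["input"][:300])   # long source document
print(example["output"])        # reference summary

# ZeroSCROLLS wraps similar tasks as zero-shot prompts in the same format.
zero = load_dataset("tau/zero_scrolls", "gov_report",
                    split="validation", trust_remote_code=True)
print(zero[0]["input"][:300])
```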
upvoted an article about 1 month ago

ColPali: Efficient Document Retrieval with Vision Language Models 👀

By manu
upvoted an article about 2 months ago

Visualize and understand GPU memory in PyTorch
