video-p2p-library

video-p2p-library's activity

AtAndDev
posted an update 9 days ago
2668
deepseek-ai/DeepSeek-R1-0528

This is the end
  • 1 reply

Nymbo
posted an update 28 days ago
2236
Haven't seen this posted anywhere - Llama-3.3-8B-Instruct is available on the new Llama API. Is this a new model or did someone mislabel Llama-3.1-8B?
  • 1 reply

Nymbo
posted an update about 1 month ago
2058
PSA for anyone using Nymbo/Nymbo_Theme or Nymbo/Nymbo_Theme_5 in a Gradio space ~

Both of these themes have been updated to fix some of the long-standing inconsistencies ever since the transition to Gradio v5. Textboxes are no longer bright green and in-line code is readable now! Both themes are now visually identical across versions.

If your space is already using one of these themes, you just need to restart your space to get the latest version. No code changes needed.

AtAndDev
posted an update 2 months ago
3021
Llama 4 is out...

AtAndDev
posted an update 3 months ago
4293
There seem to be multiple paid apps shared here that are based on models on hf, but some ppl sell their wrappers as "products" and promote them here. For a long time, hf was the best and only platform to do oss model stuff, but with the recent AI website builders anyone can create a product (really crappy ones btw) and try to sell it with no contribution to oss stuff. Please don't do this, or at least try finetuning the models you use...
Sorry for filling y'all's feed with this bs but yk...
  • 6 replies

AtAndDev
posted an update 3 months ago
1625
Gemma 3 seems to be really good at human preference. Just waiting for ppl to see it.

ehristoforu
posted an update 3 months ago
3334
Introducing our first standalone model - FluentlyLM Prinum

Introducing the first standalone model from Project Fluently LM! We worked on it for several months, used different approaches and eventually found the optimal one.

General characteristics:
- Model type: Causal language models (QwenForCausalLM, LM Transformer)
- Number of parameters: 32.5B
- Number of parameters (not embedded): 31.0B
- Number of layers: 64
- Context: 131,072 tokens
- Language(s) (NLP): English, French, Spanish, Russian, Chinese, Japanese, Persian (officially supported)
- License: MIT

Creation strategy:
The basis of the strategy is shown in Pic. 2.
We used Axolotl & Unsloth for SFT-finetuning with PEFT LoRA (rank=64, alpha=64) and Mergekit for SLERP and TIES mergers.
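
For readers unfamiliar with SLERP merging, here is a minimal sketch of spherical linear interpolation between two flattened weight vectors. It is an illustration only, not the Mergekit implementation or the exact procedure used for this model; the function name `slerp` and the toy inputs are hypothetical.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between vectors v0 and v1 (0 <= t <= 1).

    Hypothetical helper for illustration; real merge tools operate on
    full tensors and handle many more edge cases.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 < eps or norm1 < eps:
        # A zero vector has no direction: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Cosine of the angle between the two vectors, clamped for acos.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)

    if sin_theta < eps:
        # Nearly parallel (or antiparallel) vectors: use linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Standard SLERP weights.
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

# Toy usage: the midpoint between two orthogonal unit vectors.
midpoint = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

Unlike a plain average, the midpoint of two unit vectors stays on the unit circle, which is the usual motivation for preferring SLERP when blending weights from two models.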

Evaluation:
🏆 12th place in the Open LLM Leaderboard (open-llm-leaderboard/open_llm_leaderboard) (21.02.2025)

Detailed results and comparisons are presented in Pic. 3.

Links:
- Model: fluently-lm/FluentlyLM-Prinum
- GGUF version: mradermacher/FluentlyLM-Prinum-GGUF
- Demo on ZeroGPU: ehristoforu/FluentlyLM-Prinum-demo
  • 7 replies

AtAndDev
posted an update 4 months ago
2461
@nroggendorff is that you sama?
  • 2 replies

ameerazam08
posted an update 4 months ago

AtAndDev
posted an update 4 months ago
1913
everywhere i go i see his face

AtAndDev
posted an update 5 months ago
551
Deepseek gang on fire fr fr

AtAndDev
posted an update 5 months ago
1631
R1 is out! And with a lot of other R1-related models...

ehristoforu
posted an update 6 months ago
4168
✒️ Ultraset - all-in-one dataset for SFT training in Alpaca format.
fluently-sets/ultraset

❓ Ultraset is a comprehensive dataset for training Large Language Models (LLMs) using SFT (supervised fine-tuning on instruction data). This dataset consists of over 785 thousand entries in eight languages: English, Russian, French, Italian, Spanish, German, Chinese, and Korean.

🤯 Ultraset solves the problem faced by users when selecting an appropriate dataset for LLM training. It combines various types of data required to enhance the model's skills in areas such as text writing and editing, mathematics, coding, biology, medicine, finance, and multilingualism.

🤗 For effective use of the dataset, it is recommended to use only the "instruction," "input," and "output" columns and train the model for 1-3 epochs. The dataset does not include DPO or Instruct data, making it suitable for training various types of LLMs.
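
As a rough illustration of how the three columns fit together, here is a minimal sketch that turns one record into a single training string. It assumes the widely used Stanford Alpaca prompt template; the post does not say which template Ultraset is meant to be used with, and the function name `format_alpaca` and the sample record are hypothetical.

```python
def format_alpaca(record):
    """Build one SFT training string from an Alpaca-style record.

    Assumes the common Stanford Alpaca template (an assumption, not
    something the dataset card is quoted as requiring).
    """
    instruction = record["instruction"].strip()
    context = (record.get("input") or "").strip()
    output = record["output"].strip()

    if context:
        # Variant used when the record carries extra input context.
        prompt = (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{context}\n\n"
            "### Response:\n"
        )
    else:
        # Variant used when the "input" column is empty.
        prompt = (
            "Below is an instruction that describes a task. Write a response "
            "that appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            "### Response:\n"
        )
    return prompt + output

# Hypothetical sample record with the three recommended columns.
sample = {
    "instruction": "Translate the sentence into French.",
    "input": "Good morning.",
    "output": "Bonjour.",
}
text = format_alpaca(sample)
```

Records with an empty "input" column take the shorter template, so the model never sees an empty "### Input:" section.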

❇️ Ultraset is an excellent tool to improve your language model's skills in diverse knowledge areas.

akhaliq
posted an update 6 months ago
20609
Google drops Gemini 2.0 Flash Thinking

a new experimental model that unlocks stronger reasoning capabilities and shows its thoughts. The model plans (with thoughts visible), can solve complex problems with Flash speeds, and more

now available in anychat, try it out: https://huggingface.co/spaces/akhaliq/anychat