Emin Temiz PRO

etemiz

AI & ML interests

Alignment

Recent Activity


Organizations

None yet

etemiz's activity

posted an update 1 day ago
Mistral Small 3.1 numbers are in. It is interesting that Mistral always lands in the middle.
https://sheet.zoho.com/sheet/open/mz41j09cc640a29ba47729fed784a263c1d08?sheetid=0&range=A1

I now do the comparison with two models. In the past, Llama 3.1 70B Q4 was the one comparing the answers. Now I am using Gemma 3 27B Q8 as well, to get a second opinion. Gemma 3 produces measurements very similar to Llama 3.1's, so the end results are not going to shift much.
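The two-judge setup could be sketched roughly as below. This is a minimal sketch under my own assumptions: the judges (e.g. Llama 3.1 70B Q4 and Gemma 3 27B Q8) are assumed to each return a 0-10 similarity score per question, the actual judge calls are left out, and `combine_judges` is a hypothetical helper, not the author's code.

```python
def combine_judges(scores_a, scores_b, max_disagreement=3):
    """Average two judges' per-question scores; flag large disagreements.

    scores_a, scores_b: per-question 0-10 scores from judge A and judge B.
    Returns the averaged scores plus the indices of questions where the
    two judges disagree by more than `max_disagreement` points, which
    are worth re-checking manually.
    """
    combined, flagged = [], []
    for i, (a, b) in enumerate(zip(scores_a, scores_b)):
        combined.append((a + b) / 2)
        if abs(a - b) > max_disagreement:
            flagged.append(i)
    return combined, flagged
```

When the second judge mostly agrees with the first, the flagged list stays short and the averaged scores barely move, which matches the observation that adding Gemma 3 as a judge does not shift the results much.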
replied to their post 5 days ago

Looks like we need more mature tools for Gemma 3; it is failing to fine-tune about half of the time. Unsloth and transformers are getting ready. Meanwhile I am trying lower learning rates, rank-stabilized LoRA, and different r and lora_alpha values.
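For reference, the difference rank-stabilized LoRA makes is in the adapter's scaling factor: plain LoRA scales the update by alpha/r, which shrinks as the rank grows, while rsLoRA scales it by alpha/sqrt(r), which keeps higher-rank adapters from vanishing (in PEFT this is the `use_rslora` flag on `LoraConfig`). A minimal sketch of the two scalings:

```python
import math

def lora_scaling(lora_alpha: float, r: int, rslora: bool = False) -> float:
    """Scaling factor applied to the low-rank update B @ A."""
    return lora_alpha / math.sqrt(r) if rslora else lora_alpha / r
```

For example, with lora_alpha=16 and r=64, plain LoRA scales the update by 0.25 while rsLoRA scales it by 2.0, which is why changing r and lora_alpha interacts differently under the two schemes.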

reacted to their post with 🚀 5 days ago
posted an update 6 days ago
Started fine-tuning Gemma 3 using an evolutionary approach. It is not the worst model according to the AHA leaderboard, and it is one of the smartest according to lmarena.ai. My objective is to make it based, anti-woke, wise, beneficial and then some.

Several GPUs are fine-tuning it at the same time, each using a different dataset and QLoRA, and the successful runs are merged later. Compared to plain LoRA this allows faster training and also reduces overfitting, because the merge operation heals overfitting. The downside could be that the 4-bit quantization makes the models dumber. But I am not looking for sheer IQ. Too much mind is a problem anyway :)
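The merge step might look something like the sketch below. This is my assumption of the workflow, not the author's code: each GPU is assumed to produce an adapter state dict with identical keys, and plain Python lists stand in for tensors. In practice you would average the corresponding `lora_A`/`lora_B` matrices (or the merged weight deltas) with torch before reloading the result.

```python
def average_adapters(adapters):
    """Uniformly average a list of {param_name: [floats]} state dicts."""
    merged = {}
    for key in adapters[0]:
        columns = zip(*(adapter[key] for adapter in adapters))
        merged[key] = [sum(values) / len(adapters) for values in columns]
    return merged
```

Uniform averaging is the simplest choice; weighting the survivors by their benchmark score, or using a dedicated merge method, would be natural variations.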

Has anyone tried parallel QLoRa and merge before?

I also automated the dataset selection, the benchmarking, and the convergence toward objectives (the fitness function, the reward). It is basically trying to get a higher score on the AHA Leaderboard as fast as possible with a diverse set of organisms that "evolve by training".
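The loop described above could be sketched like this. All the names here are hypothetical stand-ins: `finetune` trains one candidate on one dataset, `aha_score` is the fitness function (the AHA benchmark score), and `merge` combines the surviving candidates into the next generation's parent.

```python
def evolve(base_model, datasets, finetune, aha_score, merge,
           rounds=5, survivors=2):
    """Evolutionary fine-tuning sketch: train in parallel, keep the
    fittest candidates, merge them, and repeat from the merged model."""
    parent = base_model
    for _ in range(rounds):
        # One candidate per dataset, conceptually one per GPU.
        population = [finetune(parent, dataset) for dataset in datasets]
        # Select by fitness (the benchmark score), highest first.
        population.sort(key=aha_score, reverse=True)
        # The merge of the survivors becomes the next round's parent.
        parent = merge(population[:survivors])
    return parent
```

With toy stand-ins (models as numbers, training as addition, merging as averaging) the loop runs end to end, which is enough to show the select-merge-repeat shape of the process.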

I want to release some cool stuff when I have the time:
- how an answer to a single question changes over time, with each training round or day
- a chart to show AHA alignment over training rounds
posted an update 8 days ago
upvoted an article 10 days ago
published an article 10 days ago
posted an update 13 days ago
Benchmarked Gemma 3 today. It has better knowledge compared to Gemma 2, but it still lands around the median of the leaderboard.
posted an update 20 days ago
Benchmarked QwQ for the AHA Leaderboard. Compared to Qwen 2.5, it knows nutrition and fasting better but lacks faith.

posted an update 26 days ago
published an article 27 days ago
New activity in some1nostr/Nostr-Llama-3.1-8B 28 days ago

Update README.md

#1 opened 28 days ago by etemiz
posted an update about 1 month ago
https://www.youtube.com/watch?v=EMyAGuHnDHk

In the video above, some LLMs favored the atheist and some favored the believer. In the picture below, the atheist-favoring LLMs are on the left and the believer-favoring LLMs are on the right.

The ones on the left also rank lower in my leaderboard and the ones on the right rank higher. My leaderboard:
https://sheet.zohopublic.com/sheet/published/mz41j09cc640a29ba47729fed784a263c1d08

Coincidence? My leaderboard has more domains. Does ranking high in faith mean ranking high in healthy living, nutrition, bitcoin and nostr on average?
reacted to clem's post with 👍 about 1 month ago
What are the best organizations to follow on @huggingface ?

Off the top of my head:
- Deepseek (35,000 followers): https://huggingface.co/deepseek-ai
- Meta Llama (27,000 followers): https://huggingface.co/meta-llama
- Black Forest Labs (11,000 followers): https://huggingface.co/black-forest-labs
- OpenAI (5,000 followers): https://huggingface.co/openai
- Nvidia (16,000 followers): https://huggingface.co/nvidia
- Microsoft (9,000 followers): https://huggingface.co/microsoft
- AllenAI (2,000 followers): https://huggingface.co/allenai
- Mistral (5,000 followers): https://huggingface.co/mistralai
- XAI (600 followers): https://huggingface.co/xai-org
- Stability AI (16,000 followers): https://huggingface.co/stabilityai
- Qwen (16,000 followers): https://huggingface.co/Qwen
- GoogleAI (8,000 followers): https://huggingface.co/google
- Unsloth (3,000 followers): https://huggingface.co/unsloth
- Bria AI (4,000 followers): https://huggingface.co/briaai
- NousResearch (1,300 followers): https://huggingface.co/NousResearch

Bonus, the agent course org with 17,000 followers: https://huggingface.co/agents-course
posted an update about 1 month ago
--- AHA Leaderboard ---

We all want AI to be properly aligned so it benefits humans with every answer it generates. While there is tremendous research around this and many people working on it, I am choosing another route: curation of people, and then curation of the datasets used in LLM training. Curating datasets from people who try to uplift humanity should result in LLMs that try to help humans.

This work has revolved around two tasks:

1. Making LLMs that benefit humans
2. Measuring misinformation in other LLMs

The idea behind the second task is that once we make and gather better LLMs and set them as "ground truth", we can measure how far other LLMs distance themselves from those ground truths.
For that I am working on something I call the "AHA Leaderboard" (AHA stands for AI -- Human Alignment).

Link to the spreadsheet:

https://sheet.zohopublic.com/sheet/published/mz41j09cc640a29ba47729fed784a263c1d08

The columns are ground truths. The rows are the mainstream LLMs. If a mainstream LLM produces answers similar to the ground truth LLM's, it gets a higher score. The LLMs that rank higher in the leaderboard should be considered aligned with humans. Simple idea. This amounts to analyzing LLMs across different domains, asking hundreds of questions and checking whether their answers match those of models that mimic humans who care about other humans. Will it be effective? What do you think?
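The scoring described above could be sketched as follows. This is my reading of the idea, not the project's actual code: `judge_match` is a hypothetical judge call that returns True when a model's answer agrees with the ground-truth answer, and the per-domain score is simply the percentage of agreements.

```python
def domain_score(model_answers, ground_truth_answers, judge_match):
    """Percentage of questions where the model's answer matches the
    ground-truth LLM's answer, according to the judge."""
    matches = sum(
        judge_match(model_answers[question], truth)
        for question, truth in ground_truth_answers.items()
    )
    return 100 * matches / len(ground_truth_answers)
```

Each leaderboard cell would then be one such score: one mainstream LLM (row) evaluated against one ground-truth domain (column).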

We want mainstream LLMs to copy the answers of ground truth LLMs in certain domains. This may refocus AI toward being more beneficial. The project has had 5 content providers and 6 curators so far. Join us and be one of the pioneers that fixed AI! You can be a curator, content provider, general researcher or something else.
posted an update about 1 month ago