JoseRFJunior/TransNAR https://github.com/JoseRFJuniorLLMs/TransNAR https://arxiv.org/html/2406.09308v1 TransNAR hybrid architecture. Similar to Alayrac et al., we interleave existing Transformer layers with gated cross-attention layers that enable information to flow from the NAR to the Transformer. Queries are generated from tokens, while keys and values are obtained from the nodes and edges of the graph. The node and edge embeddings are obtained by running the NAR on the graph version of the reasoning task to be solved. When experimenting with pre-trained Transformers, we initially close the cross-attention gate, in order to fully preserve the language model's internal knowledge at the beginning of training.
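A minimal PyTorch sketch of the gated cross-attention block described above, for intuition only. It is not the authors' code: the layer sizes, the tanh gate, and the shape of the NAR embeddings are assumptions based on the summary (queries from token states; keys/values from NAR node/edge embeddings; gate initialized closed so a pre-trained LM is untouched at the start of training).

```python
# Hedged sketch of TransNAR-style gated cross-attention (not the paper's code).
import torch
import torch.nn as nn

class GatedCrossAttention(nn.Module):
    """Cross-attend from token hidden states (queries) into NAR node/edge
    embeddings (keys/values), scaled by a learnable gate that starts at 0."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # tanh(0) == 0: the gate is closed at initialization, so this block
        # is an identity mapping and the LM's internal knowledge is preserved.
        self.gate = nn.Parameter(torch.zeros(1))
        self.norm = nn.LayerNorm(d_model)

    def forward(self, tokens: torch.Tensor, nar_emb: torch.Tensor) -> torch.Tensor:
        # tokens:  (batch, seq_len, d_model)        -> queries
        # nar_emb: (batch, n_nodes + n_edges, d_model) -> keys and values,
        #          produced by running the NAR on the graph version of the task
        attended, _ = self.attn(query=self.norm(tokens), key=nar_emb, value=nar_emb)
        return tokens + torch.tanh(self.gate) * attended

# Usage: at init the gate is closed, so the output equals the token states.
block = GatedCrossAttention(d_model=512)
tokens = torch.randn(2, 16, 512)   # token hidden states from a Transformer layer
nar_emb = torch.randn(2, 40, 512)  # node + edge embeddings from the NAR
out = block(tokens, nar_emb)       # ~= tokens until the gate opens in training
```

In the full architecture, one of these blocks would be interleaved after each existing Transformer layer, which is how information flows from the NAR into the Transformer without disturbing the pre-trained weights early on.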
This is the closest I've seen to a scalable AI/LLM Operating System - it has all the major ingredients of a feasible AI OS architecture:
- Extends classical OS functionality with an LLM Kernel.
- A multi-agent-centric approach.
- An optimized resource-allocation system that allows LLM-based tasks and classical OS tasks to coexist.
- An Agent Scheduler that can apply classical OS scheduling policies (FIFO, RR) - see the sketch below.
- A Context Manager to improve alignment.
- A lazy Memory Manager for agents (ensures data is stored and accessible only while the agent is active).
- An enhanced security module for the AI-driven environment.
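To make the scheduler item concrete, here is a toy sketch of FIFO and round-robin scheduling over agents. The names (`Agent`, `remaining_steps`, `time_slice`) are hypothetical stand-ins, not the actual API of this system; each "step" stands in for one LLM inference call.

```python
# Toy agent scheduler illustrating FIFO and round-robin (RR) policies.
# Hypothetical names; not the real scheduler API.
from collections import deque
from dataclasses import dataclass

@dataclass
class Agent:
    name: str
    remaining_steps: int  # LLM calls still needed to finish the task

def run_fifo(agents: list[Agent]) -> None:
    """Run each agent to completion in arrival order."""
    queue = deque(agents)
    while queue:
        agent = queue.popleft()
        while agent.remaining_steps > 0:
            agent.remaining_steps -= 1  # stand-in for one LLM inference step
        print(f"{agent.name} finished")

def run_round_robin(agents: list[Agent], time_slice: int = 2) -> None:
    """Give each agent `time_slice` steps, then rotate, so long-running
    LLM tasks cannot starve the others."""
    queue = deque(agents)
    while queue:
        agent = queue.popleft()
        for _ in range(min(time_slice, agent.remaining_steps)):
            agent.remaining_steps -= 1
        if agent.remaining_steps > 0:
            queue.append(agent)  # not done: back of the queue
        else:
            print(f"{agent.name} finished")

run_round_robin([Agent("travel-planner", 3), Agent("code-review", 5)])
```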
It does hit all the checkpoints, doesn't it? A scaled-up version of @karpathy's LLM OS idea.