Vincent Granville PRO
vincentg64
AI & ML interests
GenAI, LLM, synthetic data, optimization, fine-tuning, model evaluation
Recent Activity
posted an update 27 days ago
A New Type of Non-Standard High Performance DNN with Remarkable Stability – https://mltblog.com/3SA3OJ1
I explore deep neural networks (DNNs) starting from the foundations, introducing a new type of architecture that is as different from machine learning as it is from traditional AI. The original adaptive loss function, introduced here for the first time, leads to spectacular performance improvements via a mechanism called equalization.
To accurately approximate any response, rather than connecting neurons with linear combinations and activations between layers, I use non-linear functions without activation. This reduces the number of parameters, leading to explainability, easier fine-tuning, and faster training. The adaptive equalizer, a dynamical subsystem of its own, eliminates the linear part of the model, focusing on higher-order interactions to accelerate convergence.
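For intuition only, here is a minimal NumPy sketch, not the architecture from the paper, of what non-linear connections without activation plus a per-sample adaptive (equalizer-style) loss could look like. The signed-power link, the re-weighting rule, and all names (p, eta, sample_w) are my own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def nonlinear_link(x, w, p):
    """One connection: signed power of a weighted sum, no separate activation."""
    z = x @ w
    return np.sign(z) * np.abs(z) ** p

def forward(x, params):
    h = x
    for w, p in params:
        h = nonlinear_link(h, w, p)
    return h

def adaptive_loss(y_pred, y_true, sample_w):
    """Squared error with per-sample weights, a crude stand-in for equalization."""
    res = ((y_pred - y_true) ** 2).ravel()
    return np.mean(sample_w * res), res

# Toy target: y = sin(2*pi*x) on [0, 1]
x = rng.uniform(0.0, 1.0, size=(128, 1))
y = np.sin(2 * np.pi * x)

params = [(0.5 * rng.normal(size=(1, 6)), 1.3),
          (0.5 * rng.normal(size=(6, 1)), 1.0)]
sample_w = np.ones(128)
eta, h = 0.02, 1e-5

for step in range(500):
    loss, res = adaptive_loss(forward(x, params), y, sample_w)
    # Re-weight toward the worst-fit samples (one possible adaptive scheme)
    sample_w = 0.9 * sample_w + 0.1 * res / (res.mean() + 1e-12)
    # Numerical gradients keep the sketch short (no autodiff, NumPy only)
    for i, (w, p) in enumerate(params):
        g = np.zeros_like(w)
        for idx in np.ndindex(w.shape):
            w2 = w.copy(); w2[idx] += h
            trial = params[:i] + [(w2, p)] + params[i + 1:]
            g[idx] = (adaptive_loss(forward(x, trial), y, sample_w)[0] - loss) / h
        params[i] = (w - eta * g, p)

print("final weighted loss:", adaptive_loss(forward(x, params), y, sample_w)[0])
```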
One example involves the Riemann zeta function. I exploit its well-known universality property to approximate any response. My system also handles singularities, useful for rare events or fraud detection. The loss function can be nowhere differentiable, like Brownian motion. Many of the new discoveries are applicable to standard DNNs. Built from scratch, the Python code does not rely on any library other than NumPy. In particular, I do not use PyTorch, TensorFlow, or Keras.
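As a loose illustration of the zeta idea, and only that, the sketch below approximates zeta(s) inside the critical strip with the alternating Dirichlet eta series (crude, slowly converging), then fits a target curve with a least-squares combination of vertically shifted evaluations. The line Re(s) = 0.75, the shifts, and the target are arbitrary choices of mine, not the paper's construction.

```python
import numpy as np

def zeta(s, n_terms=2000):
    """Crude zeta for Re(s) > 0, s != 1, via the Dirichlet eta series."""
    n = np.arange(1, n_terms + 1)
    signs = np.where(n % 2 == 1, 1.0, -1.0)
    eta = np.sum(signs / n ** s)
    return eta / (1.0 - 2.0 ** (1.0 - s))

# Target response on a grid of heights t
t = np.linspace(0.0, 10.0, 200)
y = np.sin(t) + 0.3 * t

# Basis: real part of zeta on the line Re(s) = 0.75, at shifted heights
shifts = [20.0, 35.0, 50.0, 65.0, 80.0]          # arbitrary illustrative shifts
X = np.column_stack(
    [[zeta(0.75 + 1j * (ti + sh)).real for ti in t] for sh in shifts]
    + [np.ones_like(t)]                          # intercept column
)

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print("max abs fit error:", np.max(np.abs(X @ coef - y)))
```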
➡️ The PDF with many illustrations is available as paper 55, at https://mltblog.com/3EQd2cA. It also features the replicable Python code (with a link to GitHub), the data generated by the code, the theory, and various options, including for evaluation.
posted an update about 2 months ago
How to Design LLMs that Don’t Need Prompt Engineering https://mltblog.com/3GAbAQu
Standard LLMs rely on prompt engineering to fix problems (hallucinations, poor responses, missing information) that come from issues in the backend architecture. If the backend (corpus processing) is properly built from the ground up, it is possible to offer a full, comprehensive answer to a meaningful prompt, without the need for multiple prompts, rewording your query, going through a chat session, or prompt engineering. In this article, I explain how to do it, focusing on enterprise corpora. The strategy relies on four principles:
➡️ Exact and augmented retrieval
➡️ Showing full context in the response
➡️ Enhanced UI with option menu
➡️ Structured response as opposed to long text
I now explain these principles.
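To make the retrieval-plus-structured-response idea concrete, here is a toy sketch of my own (not the xLLM backend): exact keyword matching over a tiny corpus, with the answer returned as a structured object exposing the matched documents, their full context, and follow-up options instead of free-form text. The corpus, function names, and option menu are hypothetical.

```python
from collections import defaultdict

corpus = {
    "doc1": "Gradient clipping stabilizes training of deep networks.",
    "doc2": "Adaptive loss functions can reweight hard samples during training.",
    "doc3": "Retrieval-augmented generation grounds answers in an enterprise corpus.",
}

def build_index(docs):
    """Exact-match inverted index: token -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token.strip(".,")].add(doc_id)
    return index

def answer(query, docs, index):
    """Structured response: matches, full context, and follow-up options."""
    tokens = [t.strip(".,").lower() for t in query.split()]
    hits = set.union(*(index.get(t, set()) for t in tokens)) if tokens else set()
    return {
        "query": query,
        "matched_docs": sorted(hits),
        "context": {d: docs[d] for d in sorted(hits)},   # full context shown
        "options": ["narrow by section", "show related terms", "export results"],
    }

index = build_index(corpus)
print(answer("adaptive loss training", corpus, index))
```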
Read full article at https://mltblog.com/3GAbAQu
#xLLM #BondingAI #PromptEngineering
Organizations
None yet