With the big hype around AI agents these days, I couldnโt stop thinking about how AI agents could truly enhance real-world activities. What sort of applications could we build with those AI agents: agentic RAG? self-correcting text-to-sql? Nah, boringโฆ
Passionate about outdoors, Iโve always dreamed of a tool that could simplify planning mountain trips while accounting for all potential risks. Thatโs why I built ๐๐น๐ฝ๐ถ๐ป๐ฒ ๐๐ด๐ฒ๐ป๐, a smart assistant designed to help you plan safe and enjoyable itineraries in the French Alps and Pyrenees.
Built using Hugging Face's ๐๐บ๐ผ๐น๐ฎ๐ด๐ฒ๐ป๐๐ library, Alpine Agent combines the power of AI with trusted resources like ๐๐ฌ๐ช๐ต๐ฐ๐ถ๐ณ.๐ง๐ณ (https://skitour.fr/) and METEO FRANCE. Whether itโs suggesting a route with moderate difficulty or analyzing avalanche risks and weather conditions, this agent dynamically integrates data to deliver personalized recommendations.
In my latest blog post, I share how I developed this projectโfrom defining tools and integrating APIs to selecting the best LLMs like ๐๐ธ๐ฆ๐ฏ2.5-๐๐ฐ๐ฅ๐ฆ๐ณ-32๐-๐๐ฏ๐ด๐ต๐ณ๐ถ๐ค๐ต, ๐๐ญ๐ข๐ฎ๐ข-3.3-70๐-๐๐ฏ๐ด๐ต๐ณ๐ถ๐ค๐ต, or ๐๐๐-4.
Today, we spoke with Snowflakeโs AI Research Team Leads, Yuxiong He and Samyam Rajbhandari (@samyam) (he is also one the researchers behind DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and
DeepSpeed-Inference (2401.08671) and other DeepSpeed papers) Collaborating with their co-authors to reduce inference costs for enterprise-specific tasks, they observed that inputs are often significantly larger than outputs. This is because itโs in the nature of enterprises to analyze enormous amounts of information trying to extract valuable insights, which are much shorter. To address this, they developed SwiftKV SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving
Model Transformation (2410.03960), an optimization that reduces LLM inference costs by up to 75% for Meta Llama LLMs, enhancing efficiency and performance in enterprise AI tasks.
Today they are open-sourcing SwiftKV (Snowflake/Llama-3.1-SwiftKV-8B-Instruct) and ArcticTrainging Platform. In our new episode "15 minutes with a Researcher" they explain how SwiftKV works, its applicability to other architectures, its limitations, and additional methods to further reduce computation costs in inference. Watch the full 15 min interview here (https://youtu.be/9x1k7eXe-6Q?si=4_HQOyi1CPHgvlrx)
reacted to zamal's
post with ๐about 14 hours ago
Interact with your PDF documents like never before! ๐คฏ Extract text & images, then ask context-aware questions based on both. Powered by RAG techniques & multimodal LLMs. Perfect for studying, research & more! ๐๐ Try it out now!!!! โ๏ธ