Building Models & Improvement Wishes - Llama Stack RFC

Building Models - Llama Stack RFC

Meta wants to enable everyone to get the most out of the 405B model, including use cases such as:

  • Real-time and batch inference
  • Supervised fine-tuning
  • Evaluation of your model for your specific application
  • Continual pre-training
  • Retrieval-Augmented Generation (RAG)
  • Function calling
  • Synthetic data generation

RFC-0001 - Llama Stack · Issue #6 · meta-llama/llama-toolchain (github.com)

Model Improvement Wishes

Model Capabilities

  • Structured Output: Improved support for generating structured data (e.g., JSON) through restricted prediction (constrained decoding).
  • Knowledge Distillation: Efficient tools and pipelines for transferring knowledge from larger models to smaller ones.

Fine-Tuning

  • Continued Pre-training: Easier access to sampled pre-training data for maintaining data distribution consistency.
  • Preference Optimization: Best practices and recipes for fine-tuning using Rejection Sampling (RS) and Direct Preference Optimization (DPO).

Model Architecture

  • Agentic Capabilities: Tools and interfaces for integrating Monte Carlo Tree Search (MCTS) with LLMs to enhance logical reasoning.

Discussion

Models

  • Restricted Prediction: It would be awesome to have native, efficient support for restricted prediction against a pre-defined schema, similar to the JSON schema support in llama.cpp grammars. This could boost the usability of the models for generating reliable structured data.
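
To make the idea of restricted prediction concrete, here is a minimal sketch of the underlying mechanism: at each step, tokens that would break the schema are masked out before the next token is chosen. Everything in it is an assumption made for illustration: the toy vocabulary, the `ALLOWED_OUTPUTS` stand-in for a schema, and the `fake_logits` function that replaces a real model forward pass. It is not how llama.cpp or any Llama API implements grammars.

```python
import json
import random

# Hypothetical toy vocabulary; a real tokenizer has far more tokens.
VOCAB = ['{', '}', '"sentiment"', ':', '"positive"', '"negative"', 'maybe', ',']

# Hypothetical stand-in for a schema: every token sequence the grammar accepts.
ALLOWED_OUTPUTS = [
    ['{', '"sentiment"', ':', '"positive"', '}'],
    ['{', '"sentiment"', ':', '"negative"', '}'],
]

def allowed_next_tokens(prefix):
    """Tokens that keep `prefix` a valid prefix of some accepted output."""
    n = len(prefix)
    return {seq[n] for seq in ALLOWED_OUTPUTS if len(seq) > n and seq[:n] == prefix}

def fake_logits(prefix, rng):
    """Stand-in for a model forward pass: random scores over the vocabulary."""
    return {tok: rng.random() for tok in VOCAB}

def constrained_decode(seed=0):
    rng = random.Random(seed)
    prefix = []
    while True:
        allowed = allowed_next_tokens(prefix)
        if not allowed:  # no valid continuation left: the output is complete
            break
        logits = fake_logits(prefix, rng)
        # Mask step: ignore every token the grammar forbids, then pick greedily.
        best = max(allowed, key=lambda tok: logits[tok])
        prefix.append(best)
    return "".join(prefix)

result = constrained_decode()
parsed = json.loads(result)  # parses, because invalid tokens were never eligible
print(parsed)
```

Even though the fake model scores tokens such as `maybe` at random, the output always parses as JSON, because the mask removes invalid tokens before selection. A real implementation would compile a JSON schema into a grammar rather than enumerate accepted outputs.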

Fine-tuning

  • Continued Pre-training: Maintaining the original data distribution during continued pre-training is tricky. A common approach is to mix sampled pre-training data with the new training data. If Llama models could provide access to a sampled pre-training dataset, it would make this process a lot smoother and ensure consistency.

  • Knowledge Distillation: The 405B model is amazing. I wish we had an end-to-end (E2E) knowledge distillation tool, pipeline, or API for fine-tuning smaller models on the token distributions of teacher models.

  • RS/DPO: It is good to know that Llama 3.1 has switched from RLHF to Rejection Sampling (RS) and Direct Preference Optimization (DPO) for optimizing preferences. It would be amazing if Meta offered sample fine-tuning recipes or best practices that could help us fine-tune Llama 3.1 models effectively and avoid overfitting.
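
Two of the items above come down to a loss function, so a small sketch may help show what such recipes would need to compute. The first function is a temperature-scaled KL divergence between teacher and student token distributions at one position, a common form of distillation loss. The second is the DPO loss for a single preference pair as given in the DPO paper. The function names, the temperature of 2.0, and `beta=0.1` are illustrative assumptions, not Meta's settings or recipe.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of raw scores."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over the vocabulary at one token position."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of summed sequence log-probs."""
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) computed as softplus(-margin) for stability.
    x = -margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

# The student matches the teacher exactly, so the distillation loss is zero.
print(distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
# The policy equals the reference, so the DPO loss is log(2).
print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
```

In practice both losses would be averaged over many token positions or preference pairs and computed on tensors in a framework with automatic differentiation. The plain-Python version only shows the arithmetic.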

Agentic

  • MCTS + LLM: There have been some cool attempts to use Monte Carlo Tree Search (MCTS) to boost the logical reasoning capabilities of LLMs. However, there’s still a gap when it comes to tools or viable paths for tightly integrating MCTS with LLMs, except for some rumored projects in closed-source models. Creating a robust tool or interface for this integration would be a huge win for AI agent developers.
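
As a rough illustration of what such an interface might wrap, here is a minimal MCTS loop (selection by UCT, expansion, random rollout, backpropagation) over partial reasoning chains. The functions `propose_steps` and `evaluate` are hypothetical stand-ins for LLM calls. In this toy they propose the numbers 1 to 3 and reward chains of four steps that sum to 10, so the search can run without a model. The exploration constant and iteration count are arbitrary choices.

```python
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state = state      # partial reasoning chain, as a tuple of steps
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0        # sum of rewards backed up through this node

def propose_steps(state):
    """Hypothetical stand-in for an LLM proposing candidate next steps."""
    return [] if len(state) >= 4 else [1, 2, 3]

def evaluate(state):
    """Hypothetical stand-in for an LLM or verifier scoring a finished chain."""
    return 1.0 if sum(state) == 10 else 0.0

def uct(child, parent_visits, c=1.4):
    if child.visits == 0:
        return float("inf")     # try unvisited children first
    exploit = child.value / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore

def mcts(root_state=(), iterations=500, seed=0):
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: follow the best UCT child down to a leaf.
        while node.children:
            parent = node
            node = max(parent.children, key=lambda ch: uct(ch, parent.visits))
        # 2. Expansion: add one child per proposed step, then pick one.
        steps = propose_steps(node.state)
        if steps:
            node.children = [Node(node.state + (s,), node) for s in steps]
            node = rng.choice(node.children)
        # 3. Simulation: finish the chain with random steps and score it.
        state = node.state
        while propose_steps(state):
            state = state + (rng.choice(propose_steps(state)),)
        reward = evaluate(state)
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Read out the most-visited chain.
    best = root
    while best.children:
        best = max(best.children, key=lambda ch: ch.visits)
    return best.state

print(mcts())
```

With a real model, `propose_steps` would sample candidate continuations and `evaluate` would call a reward model or verifier. Those calls dominate the cost, which is one reason tighter integration between the search loop and the inference engine would help.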
