Microsoft's Phi-3 Mini Model Released

Hot on the heels of last week's Llama 3 release, Microsoft has unveiled Phi-3. The series comes in three sizes: mini (3.8B parameters), small (7B), and medium (14B), with the mini version released today. Phi-3-mini is claimed to rival Llama 3 8B despite having fewer than half as many parameters. Key details of the release:

Instruct Versions: Phi-3-mini ships as two instruct models, one with a 4k context window and one with a 128k context window, both released under the MIT license.

Training and Optimization: The model was trained on 3.3 trillion tokens and then fine-tuned with SFT (supervised fine-tuning) and DPO (Direct Preference Optimization).

Performance Metrics: Phi-3-mini scores 68.8 on the MMLU benchmark and 8.38 on MT-Bench, results that reportedly place it ahead of models such as Mistral 7B and Llama 3 8B Instruct.

Accessibility and Compatibility: Phi-3-mini is optimized for Hugging Face's Text Generation Inference and is available on both the Hugging Face platform and HuggingChat. It can also run on Android phones and iPhones.

Technical Reports and Recommendations: The training data combines web and synthetic data, and Phi-3-mini is intended for English-language use only. The medium model (Phi-3-medium) is reported to outperform OpenAI's GPT-3.5 on benchmarks, while the small model (Phi-3-small) surpasses Llama 3 8B Instruct.

Remaining Questions: Some questions remain unanswered, including the proportions of the dataset mix and possible contamination of the training data with benchmark material. Details on the release of Phi-3-small and Phi-3-medium are also still pending.

Phi-3-mini is a notable step for small language models: it combines benchmark results competitive with larger models, a permissive license, and the ability to run on phones.
With its release, Microsoft continues to push the boundaries of AI innovation and democratize access to advanced language technologies.
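
To put the model sizes and the on-phone claim in perspective, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need. It assumes memory is simply parameter count times bits per weight, and it ignores activations, the KV cache, and runtime overhead. The 16-bit and 4-bit precisions are illustrative assumptions; the article does not say how the phone builds are packaged.

```python
# Rough weight-memory estimate for the three Phi-3 sizes named in the article.
# Assumption: memory ~= parameter count x bits per weight. Activations,
# KV cache, and runtime overhead are NOT included.

PHI3_PARAMS = {
    "Phi-3-mini": 3.8e9,
    "Phi-3-small": 7e9,
    "Phi-3-medium": 14e9,
}


def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Return approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for name, params in PHI3_PARAMS.items():
        fp16 = weight_memory_gb(params, 16)  # assumed 16-bit weights
        int4 = weight_memory_gb(params, 4)   # assumed 4-bit quantized weights
        print(f"{name}: ~{fp16:.1f} GB at 16-bit, ~{int4:.1f} GB at 4-bit")
```

Under these assumptions, Phi-3-mini needs roughly 7.6 GB for its weights at 16-bit precision and roughly 1.9 GB at 4-bit, which suggests why a quantized 3.8B model is a plausible fit for a modern phone while the 14B model is not.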