Introduction
Large language models (LLMs), AI systems trained on massive text datasets, have advanced rapidly in recent years and can now interpret and generate human language with impressive fluency.
Models such as OpenAI's GPT-3 and Google's PaLM and Gemini can hold conversations, answer questions, summarize texts, and translate between languages.
However, these models often have hundreds of billions or even trillions of parameters (GPT-3 alone has 175 billion), so training and running them requires substantial computing resources. This has spurred interest in techniques for building smaller yet still highly capable language models.
Microsoft's newly announced Phi-2 model exemplifies this push.