Mistral AI, a Paris-based startup, has made significant strides in the AI industry by releasing a high-performing 7 billion parameter model under an open-source license. While the company champions the open-source ethos, it also acknowledges the need for commercial products.
The Paris-based startup Mistral AI has recently made waves in the AI community by open-sourcing a powerful 7 billion parameter generative language model called Mistral 7B. This model achieved state-of-the-art performance relative to its size, outperforming models like Meta's Llama 2 13B on multiple benchmarks. Mistral AI's commitment to releasing Mistral 7B under an Apache 2.0 license with no usage restrictions has been hailed as a win for the open-source AI movement. However, their parallel development of proprietary commercial models raises questions about the startup's future direction.
Mistral 7B: Pushing the Limits of Open Source AI
Mistral AI co-founders Arthur Mensch, Guillaume Lample, and Timothée Lacroix have impressive pedigrees, with experience at AI leaders like DeepMind and Meta AI. Their goal with Mistral 7B was to show that with the right techniques, smaller open-source models can match or exceed the capabilities of much larger proprietary models developed by big tech firms.
The benchmark results back up this claim. Mistral 7B uses grouped-query attention for faster inference and sliding-window attention to handle longer sequences at lower cost. According to Mistral AI, this lets it outperform Llama 2 13B and rival a 34 billion parameter Llama model on many tests. Mistral also released a fine-tuned chat version, Mistral 7B Instruct, which the company reports outperforms Llama 2 13B Chat.
Equally important is Mistral 7B's open-source Apache 2.0 license. This allows unrestricted research and commercial usage, unlike Meta's more restrictive Llama licenses. Mistral AI has fostered community engagement via GitHub and Discord, soliciting feedback to improve the model. This collaborative approach epitomizes the open-source ethos.
Open Source and Proprietary
Despite this commitment to open source, Mistral AI has indicated its next steps will involve developing proprietary commercial models optimized on private data. They argue this closed-source work will fund their ability to keep pushing the boundaries of open-source AI.
On one hand, developing private models targeted at revenue generation makes business sense for a startup needing to justify over $100 million in seed funding. Mistral must balance open ideals with financial realities. The co-founders state they will still provide model weights and code for these proprietary models.
However, this split focus diverges from the purist open-source vision that animated Mistral 7B's development. It mirrors the path of OpenAI, which began with lofty open-source ambitions but has increasingly pursued closed, monetized models like GPT-3. Mistral's rhetoric around enabling "external" AGI sounds eerily close to the wait-and-see language used by other AI leaders.
A Stark Choice Ahead
Mistral AI's outstanding open-source advance with Mistral 7B demonstrates the immense value of openness and collaboration in AI research. But their choice to also pursue proprietary work highlights the perpetual tension between openness and competitiveness in this space.
Can Mistral AI stick to their open-source "north star" as they chase the funding necessary to develop ever-larger models? Or will they go one way or the other: becoming either a purely open research collective or just another walled-off corporate AI lab?
Their path forward remains unclear. But the AI community is watching closely, eager to see if Mistral can fulfill its promise of pushing the boundaries of open-source AI while still attracting the resources to compete at the cutting edge. The startup's choices now will resonate for years to come.