The MosaicML platform enables you to easily train large AI models on your data, in your secure environment.
Build your next model.
Train large AI models at scale with a single command. Point to your S3 bucket and go. We handle the rest — orchestration, efficiency, node failures, infrastructure. Simple and scalable.
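On the platform, a training run is described by a short YAML file and launched with a single CLI command. The configuration below is a minimal sketch, not a definitive template: the container image, GPU count, and training script are placeholder assumptions.

```yaml
# run.yaml: a minimal, hypothetical MosaicML run configuration
name: llm-training-run
image: mosaicml/pytorch:latest      # placeholder container image
compute:
  gpus: 8                           # GPUs to schedule for this run
command: |
  composer train.py --data-remote s3://my-bucket/my-dataset
```

Launch it with `mcli run -f run.yaml`; the platform takes care of scheduling, recovering from node failures, and streaming logs back to you.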
Stay on the modeling frontier with our latest recipes and techniques, rigorously tested by our research team.
With a few simple steps, deploy inside your private cloud. Your training data and models never leave your firewalls. Start training in one cloud, stop and resume on another — without skipping a beat.
Freedom to own your model entirely, including the model weights. Introspect the model and better explain its decisions. Filter content and data according to your business rules.
Seamlessly integrate with your existing workflows, experiment trackers, and data pipelines. Our platform is fully interoperable, cloud agnostic, and enterprise proven.
Run more experiments in less time with our world-leading efficiency optimizations. We’ve solved the hard engineering and systems problems for you. Train with confidence that no performance is left on the table.
Choose just the pieces you need from our modular training stack. Modify our starter code however you want. Our unopinionated tools make it easier, not harder, to implement your ideas.
With the MosaicBERT architecture + training recipe, you can now pretrain a competitive BERT-Base model from scratch on the MosaicML platform for $20. We’ve released the pretraining and finetuning code, as well as the pretrained weights.
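The released weights can be loaded directly from the Hugging Face Hub. A minimal sketch, assuming the `mosaicml/mosaic-bert-base` checkpoint name; because MosaicBERT uses a custom architecture, loading it requires trusting remote code.

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# MosaicBERT reuses the standard BERT-Base tokenizer; the checkpoint
# name below is an assumption about where the weights are published.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained(
    "mosaicml/mosaic-bert-base", trust_remote_code=True
)

inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
outputs = model(**inputs)  # outputs.logits holds masked-token predictions
```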
The MosaicML platform is designed to tackle the challenges of training large models such as ChatGPT, LaMDA, and Stable Diffusion. Our blog post breaks down the difficulties of training such models, and shows how our platform makes training large AI models easier.
Loading training data becomes an escalating challenge as datasets grow and the number of nodes scales. We built StreamingDataset to make training on large datasets from cloud storage as fast, cheap, and scalable as possible. Purpose-built for multi-node, distributed training, StreamingDataset provides strong correctness guarantees, high performance, and ease of use.
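In practice, StreamingDataset is a drop-in replacement for a PyTorch dataset. A minimal sketch, assuming a dataset already written in the streaming (MDS) format; the bucket path and cache directory are placeholders.

```python
from torch.utils.data import DataLoader
from streaming import StreamingDataset

# Stream shards from object storage (placeholder path), caching them locally.
dataset = StreamingDataset(
    remote="s3://my-bucket/my-dataset",  # assumed dataset location
    local="/tmp/streaming-cache",        # local shard cache
    shuffle=True,                        # distributed-aware, deterministic shuffle
    batch_size=32,
)

# Each sample is a dict of the fields written when the dataset was created.
loader = DataLoader(dataset, batch_size=32, num_workers=8)
for batch in loader:
    ...  # ordinary training step
```

Because shard assignment and shuffling are deterministic, the same configuration yields the same sample order regardless of node count, which is what makes mid-run stop/resume across clouds possible.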
With MosaicML you can now evaluate LLMs on in-context learning tasks (LAMBADA, HellaSwag, PIQA, and more) hundreds of times faster than other evaluation harnesses. A 70B-parameter model takes only 100 seconds to evaluate on LAMBADA with 64 NVIDIA A100 GPUs, and a 1.2-trillion-parameter model evaluates in under 12 minutes on 256 A100s.
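In Composer, each in-context learning task is wrapped in an evaluator and run across all available GPUs. The sketch below is an approximation: the exact `get_icl_task_dataloader` signature and the dataset URI are assumptions based on Composer's ICL evaluation utilities.

```python
from transformers import AutoTokenizer
from composer.core import Evaluator
from composer.datasets.in_context_learning_evaluation import get_icl_task_dataloader

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder tokenizer

# Zero-shot LAMBADA-style language-modeling task; the dataset URI is a placeholder.
dl = get_icl_task_dataloader(
    "language_modeling",
    dataset_uri="s3://my-bucket/lambada.jsonl",
    tokenizer=tokenizer,
    batch_size=8,
    max_seq_len=2048,
    pad_tok_id=tokenizer.eos_token_id,
    num_fewshot=0,
    prompt_string="",
    example_delimiter="\n",
    continuation_delimiter="",
)

evaluator = Evaluator(
    label="lambada",
    dataloader=dl,
    metric_names=["InContextLearningLMAccuracy"],
)
# trainer.eval(eval_dataloader=evaluator) then runs the task in parallel.
```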
The Stanford Center for Research on Foundation Models (CRFM) and MosaicML announce the release of BioMedLM, a purpose-built AI model trained to interpret biomedical language. Editorial update: this blog post was revised on 1/30/2023 to reflect name change from PubMed GPT.
They achieved astonishing results in their first MLPerf publication, beating NVIDIA’s optimized model by 17% and the unoptimized model by 4.5x.
Packaging many algorithmic speedups in an easy-to-use API is quite a nice product.
We got done in two days what would have taken us a month.
MosaicML researchers train large-scale vision and language models across multiple GPUs and nodes every single day. They understand how scalable research pipelines should be constructed.
Talk to our ML training experts and discover how MosaicML can help you on your ML journey.
Join us if you want to build world-class ML training systems.
Open-source PyTorch library for plug-and-play speed-ups with just a few lines of code.
20+ speed-up methods for neural network training, rooted in our rigorous research.
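As a concrete sketch of the plug-and-play pattern, Composer's functional API applies individual speed-up methods to an ordinary PyTorch model in place; the torchvision ResNet-50 and the two methods shown are illustrative choices, not a prescribed recipe.

```python
import torch
import torchvision.models as models
import composer.functional as cf

model = models.resnet50()

# Each call rewrites the relevant modules in place.
cf.apply_blurpool(model)        # anti-aliased downsampling (BlurPool)
cf.apply_squeeze_excite(model)  # adds squeeze-excite attention blocks

# The modified model trains with any standard PyTorch loop.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
```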
Develop the best solutions to the most challenging problems in ML today.