New in Composer 0.12: Mid-Epoch Resumption with MosaicML Streaming, CometML ImageVisualizer, HuggingFace Model and Tokenizer Loading, and more!
We are excited to announce the release of Composer 0.12 (release notes)! This release includes several new features, plus improvements to existing capabilities. Kudos to the Composer community for your engagement, feedback, and contributions to this release.
For those who want to join the Composer community: learn more about contributing here, and message us on Slack if you have any questions or suggestions!
There are many features included in this release, so let’s dive in and see what’s new.
What's in this release
- CometML ImageVisualizer to visualize images and segmentation masks while training your computer vision models
- Oracle Cloud Infrastructure ObjectStore and Google Cloud Storage support for saving and loading checkpoints
- Mid-Epoch Resumption with MosaicML Streaming support
- Support for loading HuggingFace models and tokenizers for easier pre-training to fine-tuning workflow
- OptimizerMonitor updates to track optimizer-specific metrics
- New PyTorch 1.13 + CUDA 11.7 Docker images
- New community-contributed speedup algorithm: Gyro Dropout
The Composer library helps developers train PyTorch neural networks faster, at lower cost, and to higher accuracy. Composer includes:
- 20+ methods for speeding up training networks for computer vision and language modeling.
- An easy-to-use trainer that integrates best practices for efficient training.
- Functional forms of all speedup methods that are easy to integrate into your existing training loop.
- Strong and reproducible training baselines to get you started as quickly as possible.
Visualize your images and segmentation masks while training with CometML ImageVisualizer callback
Train MosaicML DeepLabV3, which achieves a 5x faster time-to-train than a strong baseline. See our latest segmentation blog for more details and recipes, and start training today with our example starter code for segmentation.
Save and load your checkpoints using Oracle Cloud Infrastructure (OCI) or Google Cloud Storage (GCS)
With this latest Composer release, you can save or load your checkpoints using OCI or GCS. We added direct support for Oracle Cloud Infrastructure (OCI) as an ObjectStore and support for Google Cloud Storage (GCS) via URI.
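To illustrate what addressing a checkpoint by URI involves, here is a minimal sketch that splits a `gs://` URI into a backend, a bucket, and an object path using only the Python standard library. It is not Composer's implementation: the bucket name, the `CheckpointLocation` type, the `BACKENDS` table, and the `parse_checkpoint_uri` helper are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical mapping from URI scheme to a storage backend label.
# Only `gs` is listed because this post mentions GCS support via URI.
BACKENDS = {"gs": "Google Cloud Storage"}


@dataclass(frozen=True)
class CheckpointLocation:
    backend: str
    bucket: str
    object_path: str


def parse_checkpoint_uri(uri: str) -> CheckpointLocation:
    """Split a remote checkpoint URI into backend, bucket, and object path."""
    parsed = urlparse(uri)
    if parsed.scheme not in BACKENDS:
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    if not parsed.netloc:
        raise ValueError("URI is missing a bucket name")
    return CheckpointLocation(
        backend=BACKENDS[parsed.scheme],
        bucket=parsed.netloc,
        object_path=parsed.path.lstrip("/"),
    )


# Hypothetical bucket and object names.
loc = parse_checkpoint_uri("gs://my-bucket/checkpoints/ep1.pt")
print(loc.backend, loc.bucket, loc.object_path)
```

The appeal of a URI is that one string carries everything needed to find the checkpoint, so the same training script can point at local disk or a cloud bucket without code changes.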
Track your training metrics with MLflow logging
We've added support for using MLflow to log experiment metrics and hyperparameters. Please see our Logging documentation for additional details.
Simplified console and progress bar logging
Support for Mid-Epoch Resumption and more with the latest MosaicML Streaming Library v0.2 release
We've added support in Composer for the latest v0.2 release of our Streaming library. This includes awesome new features like instant mid-epoch resumption and deterministic shuffling, regardless of the number of nodes.
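To make the idea concrete, here is a toy sketch of how mid-epoch resumption and node-independent deterministic shuffling can fit together. It is not the Streaming library's algorithm, and every name in it (`epoch_order`, `node_batches`, `global_batches`) is hypothetical. It assumes the sample order depends only on a seed and the epoch number, so any node can recompute it and jump straight to a given batch.

```python
import random
from typing import Iterator, List


def epoch_order(num_samples: int, seed: int, epoch: int) -> List[int]:
    """Global sample order for one epoch; depends only on (seed, epoch)."""
    order = list(range(num_samples))
    random.Random(seed * 1_000_003 + epoch).shuffle(order)
    return order


def node_batches(num_samples: int, batch_size: int, seed: int, epoch: int,
                 node: int, num_nodes: int,
                 start_batch: int = 0) -> Iterator[List[int]]:
    """Yield this node's share of each global batch, from start_batch on."""
    order = epoch_order(num_samples, seed, epoch)
    share = batch_size // num_nodes
    for b in range(start_batch, num_samples // batch_size):
        batch = order[b * batch_size:(b + 1) * batch_size]
        yield batch[node * share:(node + 1) * share]


def global_batches(num_nodes: int, start_batch: int = 0) -> List[List[int]]:
    """Reassemble the global batches from every node's share."""
    per_node = [
        list(node_batches(32, 8, seed=17, epoch=0, node=n,
                          num_nodes=num_nodes, start_batch=start_batch))
        for n in range(num_nodes)
    ]
    return [sum(shares, []) for shares in zip(*per_node)]


full_run = global_batches(num_nodes=1)
two_nodes = global_batches(num_nodes=2)
resumed = global_batches(num_nodes=4, start_batch=2)
print(full_run == two_nodes)    # global batches do not depend on node count
print(resumed == full_run[2:])  # resuming at batch 2 matches the original tail
```

Because the order is a pure function of the seed and epoch, resuming does not require replaying the samples already seen: the run skips directly to `start_batch`, even if the number of nodes has changed.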
MosaicML Streaming is a PyTorch-compatible dataset that lets users stream training data from cloud-based object stores; it can also read files from local disk.
Key benefits of the MosaicML Streaming library include:
- High performance, accurate streaming of training data from cloud storage
- Efficiently train anywhere, independent of training data location
- Cloud-native, no persistent storage required
- Enhanced data security: data exists only ephemerally on the training cluster
See the MosaicML Streaming release notes for more about the latest updates, and stay tuned for an upcoming blog post with more details on the MosaicML Streaming library!
Load a 🤗 HuggingFace Model and Tokenizer out of a Composer checkpoint
The model pre-training to fine-tuning workflow in Composer with HuggingFace models is now easier than ever! We've added a new utility to load a HuggingFace model and tokenizer out of a Composer checkpoint.
Getting started is easy: check out our example notebook for a full tutorial on pre-training and fine-tuning a HuggingFace transformer using the Composer library, and read the docs for more details.
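As a rough sketch of that workflow, the helper below wraps the checkpoint-loading utility. It assumes the utility is exposed as `HuggingFaceModel.hf_from_composer_checkpoint` in `composer.models` and returns a model and tokenizer pair; treat that name, the import path, and the checkpoint path in the usage comment as assumptions and confirm them against the docs for your Composer version. The wrapper name `load_hf_from_composer_checkpoint` is hypothetical.

```python
def load_hf_from_composer_checkpoint(checkpoint_path: str):
    """Return (hf_model, hf_tokenizer) restored from a Composer checkpoint."""
    # Imported lazily so this module can be imported without Composer installed.
    from composer.models import HuggingFaceModel  # assumed import path

    # Assumed API: a static method that takes the path to a Composer checkpoint.
    return HuggingFaceModel.hf_from_composer_checkpoint(checkpoint_path)


# Usage, with a hypothetical checkpoint path:
# hf_model, hf_tokenizer = load_hf_from_composer_checkpoint("checkpoints/pretrain.pt")
```

The returned objects are ordinary HuggingFace objects, so they can be handed to a fine-tuning run in the usual way.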
Track Optimizer Specific Metrics with OptimizerMonitor (previously GradMonitor)
Check out the docs for more details, and add it to your code just like any other callback!
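For example, a minimal sketch of attaching the callback might look like the following. It assumes `OptimizerMonitor` can be constructed with its default arguments and that you supply your own Composer model and dataloader; the `build_trainer` helper and the one-epoch duration are hypothetical choices made for this example.

```python
def build_trainer(model, train_dataloader):
    """Build a Composer Trainer with OptimizerMonitor attached as a callback."""
    # Imported lazily so this module can be imported without Composer installed.
    from composer import Trainer
    from composer.callbacks import OptimizerMonitor

    return Trainer(
        model=model,
        train_dataloader=train_dataloader,
        max_duration="1ep",  # hypothetical: train for a single epoch
        callbacks=[OptimizerMonitor()],  # assumes default arguments suffice
    )


# Usage, given your own ComposerModel and DataLoader:
# trainer = build_trainer(my_model, my_dataloader)
# trainer.fit()
```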
New PyTorch and CUDA versions
New community speedup algorithm: Gyro Dropout
Special thanks to Junyeol Lee and Gihyun Park at BDSL at Hanyang University for the Composer implementation of Gyro Dropout and the accompanying documentation.
We’re always looking for community contributions of speedup methods for Composer! Whether you’re looking to contribute your own speedup method from your published research or need to do a final class project on efficient machine learning and want ideas, reach out to firstname.lastname@example.org to get started.
Thanks for reading! If you'd like to learn more about Composer and be part of the community, you are welcome to download Composer and try it out for your training tasks. As you try it out, come be a part of our community by engaging with us on Twitter, joining our Slack channel, or just giving us a star on GitHub.