Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks

Published in Transactions on Machine Learning Research (TMLR), 2025

Recommended citation: Edan Kinderman, Itay Hubara, Haggai Maron, Daniel Soudry. (2025). "Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks." TMLR 2025. https://openreview.net/pdf?id=6FqwLestHv

We propose Foldable SuperNet Merge (FS-Merge), a method for merging multiple Transformers trained on different tasks, each starting from a distinct initialization, into a single model of the original size. FS-Merge optimizes a SuperNet that contains the frozen source models, fusing them with a feature reconstruction loss; after training, the SuperNet is folded back to the dimensions of one model, yielding a single network that serves all of the source tasks.
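For intuition, here is a minimal PyTorch sketch of the folding mechanic for a single linear layer. It is not the paper's implementation: `FoldableLinear`, `m_in`, `m_out`, and the toy reconstruction loop are hypothetical simplifications of the SuperNet, its trainable mixing projections, and the feature reconstruction loss described above.

```python
# Toy sketch of the "foldable SuperNet" idea for one linear layer.
# NOT the paper's code: FoldableLinear, m_in, m_out, and the loop
# below are illustrative simplifications.
import torch
import torch.nn as nn

class FoldableLinear(nn.Module):
    """SuperNet wrapping two frozen source layers of shape (d_out, d_in).

    Trainable mixing projections (m_in, m_out) learn how to combine the
    stacked source weights; their product later "folds" into a single
    weight matrix of the original size.
    """
    def __init__(self, w1: torch.Tensor, w2: torch.Tensor):
        super().__init__()
        d_out, d_in = w1.shape
        # Frozen block-diagonal stack of the source layers: (2*d_out, 2*d_in).
        self.register_buffer("w_cat", torch.block_diag(w1, w2))
        # Fold-in projection maps merged d_in features into both source input
        # spaces; fold-out projection mixes both outputs back down to d_out.
        # This init makes the folded layer start as the average of w1 and w2.
        self.m_in = nn.Parameter(torch.cat([torch.eye(d_in), torch.eye(d_in)]))
        self.m_out = nn.Parameter(0.5 * torch.cat([torch.eye(d_out), torch.eye(d_out)], dim=1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, d_in), in the merged model's feature space.
        return x @ self.m_in.T @ self.w_cat.T @ self.m_out.T

    def fold(self) -> torch.Tensor:
        # Collapse the SuperNet into one (d_out, d_in) weight matrix.
        return self.m_out @ self.w_cat @ self.m_in

# Toy feature-reconstruction training: push the SuperNet's output toward
# each frozen source layer's features on (unlabeled) task inputs.
d_in, d_out = 16, 8
w1, w2 = torch.randn(d_out, d_in), torch.randn(d_out, d_in)
layer = FoldableLinear(w1, w2)
opt = torch.optim.Adam(layer.parameters(), lr=1e-2)
for _ in range(500):
    x1, x2 = torch.randn(32, d_in), torch.randn(32, d_in)
    loss = ((layer(x1) - x1 @ w1.T) ** 2).mean() \
         + ((layer(x2) - x2 @ w2.T) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

w_merged = layer.fold()  # single merged weight, same size as w1
```

After training, `fold()` collapses the trainable projections and the frozen source weights into one matrix of the original size, so the merged model's inference cost matches that of a single source model.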

Download paper here