A Semantically Consistent Dataset for Data-Efficient Query-Based Universal Sound Separation
Bee
Submitted to ICML 2026
[Paper 📝]
[Code ⚙️]
[Hive Dataset 🐝]

Overview of the proposed automated pipeline. The framework consists of three coupled stages: (1) Ontology Reconstruction & Data Preprocessing, (2) Single-source Semantic-acoustic Alignment, and (3) Super-resolution-based Standardization.

Hive Dataset
Figure: Dataset composition across sources (treemap).
Figure: Distribution of mixture types (2mix to 5mix).
Figure: Label frequency statistics.
Performance & Efficiency
Figure: Separation performance results.
Figure: Efficiency comparison across models.

Demo Samples

Below we show inference results from different models on four mixture types (2mix, 3mix, 4mix, and 5mix).

For each mixture type, we present five test samples. Hive-trained versions of AudioSep and FlowSep are also available and can be selected via the Model Weights dropdown next to each model.


Acknowledgements

The website template is borrowed from Colorful Image Colorization and Nerfies.