NARAIM: Native Aspect Ratio Autoregressive Image Models
NeurIPS 2024
NARAIM introduces autoregressive vision transformers trained on images in their native aspect ratios. Compared with square-resized inputs, this preserves spatial context and reduces distortion, which improves downstream classification accuracy.
Daniel Gallo Fernández*, Robert van der Klis*, Răzvan Andrei Matișan*, Janusz Partyka*, Efstratios Gavves, Samuele Papa, Phillip Lippe
* Alphabetical order. Equal contribution.
Neural Information Processing Systems (NeurIPS) 2024
Workshop on Self-Supervised Learning Theory and Practice