Publication
AerialFormer: Multi-resolution Transformer for Aerial Image Segmentation
A multi-resolution Transformer for semantic segmentation of aerial imagery.
Abstract
Aerial image segmentation is semantic segmentation of imagery captured from a top-down perspective. It poses several challenges: foreground-background imbalance, complex backgrounds, intra-class heterogeneity, inter-class homogeneity, and tiny objects.
AerialFormer pairs Transformer-based multi-scale feature extraction in the contracting path with lightweight multi-dilated convolutional neural networks (CNNs) in the expanding path. This design captures both local and global context for high-resolution segmentation.
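To make the idea of multi-dilated convolution more concrete, here is a minimal sketch in plain Python. It is not the AerialFormer implementation: it works on a 1D signal rather than a 2D feature map, and the function names (`dilated_conv1d`, `receptive_field`, `multi_dilated_block`), the shared kernel, and the dilation rates `(1, 2, 3)` are all assumptions chosen for illustration. It shows only the general technique: running the same small kernel at several dilation rates in parallel widens the receptive field without adding weights.

```python
from typing import List, Sequence


def dilated_conv1d(signal: Sequence[float],
                   kernel: Sequence[float],
                   dilation: int) -> List[float]:
    """Same-length 1D dilated cross-correlation with zero padding.

    Kernel taps are spaced `dilation` samples apart, so a 3-tap kernel
    with dilation d reads positions i - d, i, and i + d.
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd for symmetric padding")
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    half = (len(kernel) // 2) * dilation  # distance from centre to outermost tap
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i - half + j * dilation
            if 0 <= idx < n:  # taps outside the signal read zero padding
                acc += weight * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilation: int) -> int:
    """Span, in samples, covered by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def multi_dilated_block(signal: Sequence[float],
                        kernel: Sequence[float],
                        dilations: Sequence[int] = (1, 2, 3)) -> List[List[float]]:
    """Apply one kernel at several dilation rates (hypothetical defaults).

    Returns one output channel per dilation rate; a real block would
    typically concatenate these channels and mix them with a 1x1 convolution.
    """
    return [dilated_conv1d(signal, kernel, d) for d in dilations]


if __name__ == "__main__":
    impulse = [0.0] * 9
    impulse[4] = 1.0
    for rate, channel in zip((1, 2, 3), multi_dilated_block(impulse, [1.0, 1.0, 1.0])):
        print(rate, receptive_field(3, rate), channel)
```

Feeding a unit impulse at index 4 through a 3-tap kernel of ones makes the effect visible: the branch with dilation 1 responds at indices 3, 4, and 5, the branch with dilation 2 at 2, 4, and 6, and the branch with dilation 3 at 1, 4, and 7. The receptive fields are 3, 5, and 7 samples, while every branch still uses three weights.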