This repository contains the code for the paper “SAR-W-MixMAE: Polarization‑Aware Self‑Supervised Pretraining for Masked Autoencoders on SAR Data.”
It builds on MixMIM/MixMAE with a Swin backbone and adds polarization-aware reconstruction weighting.
How the weighting works (two variants):
- From linear-scale VH/VV, we normalize each pixel to $[0,1]$ and compute $w_c = \exp(1 - \tilde{x}_c)$ for channel $c \in \{\mathrm{VH}, \mathrm{VV}\}$.
- We then aggregate the per-pixel weights to patch/token weights and scale the token-wise MSE.
Variants:
- Per-channel weighting: use $w_{\mathrm{VH}}$ for VH tokens and $w_{\mathrm{VV}}$ for VV tokens.
- Shared-avg weighting: use a single map $w = \tfrac{1}{2}(w_{\mathrm{VH}} + w_{\mathrm{VV}})$ for both channels.
Domain policy: inputs to the encoder are in dB, while the weighting is computed in linear scale for consistency with the physical backscatter scale.
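The weighting above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the repo's `sarwmix/weighting.py` implementation: the function name, the min-max choice for the $[0,1]$ normalization, and the patch-mean aggregation are assumptions.

```python
import numpy as np

def token_weights(x_linear, patch=16, per_channel=True):
    """Illustrative sketch of the polarization-aware weighting.

    x_linear: (2, H, W) linear-scale backscatter, channels (VH, VV).
    Returns per-token weights aggregated over non-overlapping patches.
    """
    # Normalize each channel into [0, 1] (min-max chosen here for illustration)
    mn = x_linear.min(axis=(1, 2), keepdims=True)
    mx = x_linear.max(axis=(1, 2), keepdims=True)
    x_norm = (x_linear - mn) / (mx - mn + 1e-8)

    # w_c = exp(1 - x~_c): low backscatter -> weight near e, high -> near 1
    w = np.exp(1.0 - x_norm)                      # (2, H, W)
    if not per_channel:                           # shared-avg variant
        w = w.mean(axis=0, keepdims=True).repeat(2, axis=0)

    # Aggregate pixel weights to patch/token weights by averaging each patch
    c, H, W = w.shape
    w_tok = w.reshape(c, H // patch, patch, W // patch, patch).mean(axis=(2, 4))
    return w_tok                                  # (2, H/patch, W/patch)
```

The token weights then multiply the per-token reconstruction MSE.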
- Swin + MixMIM/MixMAE pretraining with mask ratio r = 0.5, input 2×128×128 (VH, VV).
- Polarization‑aware pixel weights → aggregated token weights for the reconstruction MSE.
- Strong results on BigEarthNet v1/v2 (multi‑label) and SEN12‑FLOOD fine‑tuning.
- Upstream files from MixMIM are redistributed with written permission (2025‑10‑15) and tagged SPDX: NOASSERTION; all original files here are MIT.
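The masked-mixing idea behind MixMIM/MixMAE pretraining (mask ratio r = 0.5) can be sketched as follows. This is a hedged illustration of the general technique, not the repo's model code; the function name and shapes are assumptions.

```python
import numpy as np

def mix_tokens(tokens_a, tokens_b, mask_ratio=0.5, rng=None):
    """Sketch of MixMAE-style token mixing (illustrative, not the repo API).

    tokens_a, tokens_b: (N, D) token sequences from two different images.
    Each position of the mixed sequence comes from image A or image B;
    `mask_ratio` of the positions are taken from B.
    """
    rng = np.random.default_rng(rng)
    n = tokens_a.shape[0]
    idx = rng.permutation(n)[: int(n * mask_ratio)]
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True                                  # True -> token from image B
    mixed = np.where(mask[:, None], tokens_b, tokens_a)
    return mixed, mask
```

The decoder then reconstructs both images from the mixed sequence, with the weighted MSE applied per token.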
```
SAR-W-MixMAE/
  main_pretrain.py       # modified from MixMIM (NOASSERTION)
  main_finetune.py       # modified from MixMIM (NOASSERTION)
  engine_pretrain.py     # modified from MixMIM (NOASSERTION)
  engine_finetune.py     # modified from MixMIM (NOASSERTION)
  models_mixmim.py       # modified from MixMIM (NOASSERTION)
  models_mixmim_ft.py    # modified from MixMIM (NOASSERTION)
  models_sen12_ft.py     # new, derived from MixMIM FT (NOASSERTION)
  util/                  # upstream MixMIM files (verbatim or lightly modified) — NOASSERTION
    pos_embed.py (verbatim), lr_sched.py (verbatim), lr_decay.py (verbatim),
    datasets.py (verbatim), crop.py (verbatim), misc.py (modified), README.md
  sarwmix/               # all original utilities — MIT
    bigearthnetv1.py, bigearthnetv2.py, helper.py,
    sen12_align_s1_to_s2.py, sen12_data_prep.py, sen12_prune_partial_pairs.py,
    sen12flood_loader.py, weighting.py
  scripts/               # all original runner scripts — MIT
    qsub_run.sh, run_local.sh   # runners for ABCI/local
    prepare_sen12flood.sh       # SEN12FLOOD end-to-end prep (align→pair→clean)
  analysis/              # log parsing, metrics, and figure helpers — MIT
    exponential_2_n_epoch.py, exponential_2_n_training.py,
    logfile_reader.py, logfile_time_process.py, train_data_sampling.py
  datasets/              # CSVs/splits and metadata (e.g., S1list.json) — MIT
  LICENSES/              # MIT.txt, NOASSERTION.txt
  NOTICE                 # provenance + permission note
  THIRD_PARTY.md         # file-by-file mapping table
  requirements.txt
  INSTALL.md
  README_benv1.md
  README_sen12.md
  README.md
```
We recommend: install PyTorch first (matching your CUDA), then the rest of the Python deps.
```bash
# 1) Create env
conda create -n sarwmix python=3.12 -y
conda activate sarwmix

# 2) Install PyTorch (choose the right CUDA build)
# Example for CUDA 12.x:
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu126

# 3) Install repo dependencies (Torch is intentionally excluded from requirements.txt)
pip install -r requirements.txt
```

If you build from source or use a system CUDA, verify `torch.cuda.is_available()` and that the CUDA versions match.
- Input: Sentinel‑1 GRD; we use the VH and VV channels as 2×128×128 patches.
- You can load BENv2 with the provided dataset class: `from sarwmix.bigearthnetv2 import BigEarthNetv2`
- Expected usage: training splits via the CSVs in `datasets/`.
- Legacy support; some loaders expect Zarr containers.
- Use `sarwmix/bigearthnetv1.py` together with the train/val/test splits and labels in `datasets/`.
- Migration guide: see `README_benv1.md`.
- What: binary flood detection on pairwise SAR (VV/VH) inputs. Each sample has two timepoints (img1 = non-flood, img2 = flood | non-flood), shaped 2 × 2 × 512 × 512. Default: `--patch_op avg`.
- Data prep: run `scripts/prepare_sen12flood.sh` (zip or raw root). It performs: 1) unzip (if a zip is given) → 2) align S1→S2 grid → 3) build pairs → 4) clean partial-coverage pairs. Outputs `CURATED_SEN12FLOOD/{train,test}` and prints class counts. Requires GDAL. Full guide: `README_sen12.md`.
- Normalization (dB, keep BENv1 stats): mean = [-19.2309, -12.5951], std = [3.1672, 2.9644]
- Checkpoints: finetune from the BENv1 pretraining, e.g. `--finetune $MODELS_PATH/PRETr_CKPTs_LOGs/benv1_rand_pretrain_base/checkpoint_64.pth`
- See the full migration guide in `README_sen12.md`.
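The dB normalization with the BENv1 statistics listed above can be sketched as follows (the function name is illustrative; the mean/std values are the ones from this README):

```python
import numpy as np

# BENv1 per-channel statistics in dB, order (VH, VV), as listed above
MEAN_DB = np.array([-19.2309, -12.5951])
STD_DB = np.array([3.1672, 2.9644])

def normalize_db(x_db):
    """Standardize a (2, H, W) dB image with the per-channel BENv1 stats."""
    return (x_db - MEAN_DB[:, None, None]) / STD_DB[:, None, None]
```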
Launcher: mpirun (OpenMPI) + NCCL backend. See the sections below for the exact commands we use on ABCI and on a local 2‑GPU machine. For all options, run:

```bash
python main_pretrain.py -h
python main_finetune.py -h
```

Hardware / node
- Cluster: ABCI (single node)
- GPUs: 8 × NVIDIA H200 (the script also handles V100 nodes with 4 GPUs and A100 nodes with 8)
- CPU: `#PBS -l select=1:ncpus=192`
Distributed launch
- We use MPI (`mpirun`) to spawn one Python process per GPU and NCCL as the backend.
- GPU count and hosts are inferred from `nvidia-smi` and `$PBS_NODEFILE`.
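Under `mpirun`, each spawned process can derive its rank from OpenMPI's standard `OMPI_COMM_WORLD_*` environment variables; the local rank typically selects the GPU. The helper below is an illustrative sketch, not the repo's launcher code:

```python
import os

def dist_env(env=os.environ):
    """Sketch: derive (rank, world_size, local_rank) from OpenMPI variables.

    mpirun sets these per process; local_rank is commonly used to pick the
    device (e.g. cuda:<local_rank>) before init_process_group(backend="nccl").
    """
    rank = int(env.get("OMPI_COMM_WORLD_RANK", 0))
    world_size = int(env.get("OMPI_COMM_WORLD_SIZE", 1))
    local_rank = int(env.get("OMPI_COMM_WORLD_LOCAL_RANK", rank))
    return rank, world_size, local_rank
```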
Batch / schedule
- Per-GPU batch: 256 → global batch: 256 × 8 = 2048
- We planned 1024 epochs with 40 warmup epochs, but intentionally stopped at epoch 64 and released that checkpoint (`checkpoint_64.pth`). The finetuning and evaluation below use `checkpoint_64.pth`.
Note: The polarization‑aware weighting for reconstruction is used during pretraining.
Per-GPU batch 128 (global 128 × 8 GPUs = 1024), 50 epochs
| Dataset / Task | File | SHA-256 |
|---|---|---|
| BENv1 (pretrain, 64 ep) | `benv1_pretrain_checkpoint_64.pth` | `b4f385f96a1eef96c8b32768b57cee79060fe3d920c7273e2c70e43cc1c90700` |
| BENv1 → (finetune) | `benv1_finetune_checkpoint_best.pth` | `3778df30224903644d772d151a086d4a1bb006ab61c55ac6534ecb00fe3ae083` |
| BENv2 (pretrain, 64 ep) | `benv2_pretrain_checkpoint_64.pth` | `13ab3855a2faee9dc1ca0fb50ba483cc4e0d735b9fc8d1f03e75764c356d0b93` |
| BENv2 → (finetune) | `benv2_finetune_checkpoint_best.pth` | `9592399dcc5f34f608a4b558f897a919d4dd4feb06fc975c108e8f4191689c18` |
| SEN12-FLOOD (finetune) | `sen12_finetune_checkpoint_best.pth` | `33d06401a433191233322a24c551145e53622b6702b4fa9d6d0178cd2392c541` |
Notes
- These are `.pth` state dicts to reproduce the reported results.
- For reproducibility, verify the SHA-256 hashes after download.
- For SEN12-FLOOD, we initialize from `benv1_pretrain_checkpoint_64.pth`. See `README_sen12.md`.
```bash
# verify all files listed in CHECKSUMS.txt (OK / FAILED)
sha256sum -c CHECKSUMS.txt --ignore-missing

# or verify a single file
shasum -a 256 benv1_finetune_checkpoint_best.pth
```

If you find this work useful, please cite the following papers.
```bibtex
@article{caglayan2026jstars,
  title     = {SAR-W-MixMAE: Polarization-Aware Self-Supervised Pretraining for Masked Autoencoders on SAR Data},
  author    = {Caglayan, Ali and Imamoglu, Nevrez and Kouyama, Toru},
  journal   = {IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing},
  volume    = {19},
  pages     = {5590-5601},
  year      = {2026},
  publisher = {IEEE},
  doi       = {10.1109/JSTARS.2026.3652404}
}

@inproceedings{caglayan2025igarss,
  title     = {SAR-W-MixMAE: SAR Foundation Model Training Using Backscatter Power Weighting},
  author    = {Caglayan, Ali and Imamoglu, Nevrez and Kouyama, Toru},
  booktitle = {IGARSS 2025 - 2025 IEEE International Geoscience and Remote Sensing Symposium},
  month     = {August},
  year      = {2025},
  pages     = {265-269},
}
```

- Original files in `sarwmix/`, `scripts/`, `datasets/` → MIT (see `LICENSES/MIT.txt`).
- Upstream MixMIM/MixMAE files (verbatim or modified) in `util/` and selected top‑level `main_*.py`, `engine_*.py`, `models_*.py` → SPDX: NOASSERTION, redistributed with written permission (2025‑10‑15). See `NOTICE` and `THIRD_PARTY.md`.
See NOTICE for the permission note and THIRD_PARTY.md for the file‑by‑file mapping.
MAE (He et al.), MixMAE/MixMIM (Li et al.), Swin Transformer (Liu et al.), BEiT (Bao et al.). We cite these as prior work; our code here is based only on MixMIM, used under explicit permission. We credit and thank the authors of all these works.
