<?xml version='1.0' encoding='utf-8'?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>RS-Paper-Hub — Hyperspectral/MS Papers</title>
  <id>https://rspaper.top/output/feed_hyp.xml</id>
  <link href="https://rspaper.top/output/feed_hyp.xml" rel="self" type="application/atom+xml" />
  <link href="https://rspaper.top" rel="alternate" type="text/html" />
  <updated>2026-05-29T05:18:25Z</updated>
  <subtitle>Latest remote sensing papers (last 7 days) — 4 entries</subtitle>
  <author>
    <name>RS-Paper-Hub</name>
    <uri>https://rspaper.top</uri>
  </author>
  <entry>
    <title>FLORO: A Multimodal Geospatial Foundation Model for Ecological Remote Sensing Across Sensors and Scales</title>
    <link href="http://arxiv.org/abs/2605.28174v1" rel="alternate" type="text/html" />
    <id>http://arxiv.org/abs/2605.28174v1</id>
    <published>2026-05-27T00:00:00Z</published>
    <updated>2026-05-27T00:00:00Z</updated>
    <author>
      <name>Jorge L. Rodriguez</name>
    </author>
    <author>
      <name>Victor Angulo Morales</name>
    </author>
    <author>
      <name>Areej Alwahas</name>
    </author>
    <author>
      <name>Mariana Elias Lara</name>
    </author>
    <author>
      <name>Fida Mohammad Thoker</name>
    </author>
    <author>
      <name>Kasper Johansen</name>
    </author>
    <author>
      <name>Bernard Ghanem</name>
    </author>
    <author>
      <name>Fernando T. Maestre</name>
    </author>
    <author>
      <name>Matthew F. McCabe</name>
    </author>
    <summary type="text">Foundation models offer a promising route to transferable remote sensing representations, but many current approaches depend on very large pretraining datasets and fixed sensor configurations, limiting their suitability for ecological and environmental applications, where observations often vary across platforms, spatial and spectral resolutions, and available modalities. We introduce FLORO, a multimodal geospatial foundation model designed to learn transferable representations from a small but highly diverse remote sensing corpus. FLORO is pretrained using masked autoencoding on a heterogeneous combination of Sentinel-1, Sentinel-2, SkySAT imagery, elevation, and UAV-derived data. To accommodate sensor variability, FLORO incorporates availability-aware inputs that indicate which spectral bands and auxiliary modalities are present in each sample, enabling a unified input space across heterogeneous sensor configurations. We evaluated FLORO on the PANGAEA benchmark under a frozen-encoder protocol across scene classification, segmentation, and regression tasks. Despite being pretrained on a smaller corpus than competing foundation models, FLORO achieved strong and stable transfer across optical, optical-SAR, and optical-elevation benchmarks spanning medium-resolution satellite, airborne, and ultra-high-resolution UAV imagery. FLORO obtained the second-best average segmentation performance across six PANGAEA benchmarks, trailing only a recently introduced foundation model pretrained on over two orders of magnitude more images, remained competitive on scene classification, and was robust in regression tasks, while qualitative results showed improved preservation of spatial structure in flood, urban, biomass, and canopy-height prediction settings. In a separate controlled experiment on EuroSAT-MS, geo-positional encoding further improved classification relative to absolute positional encoding.</summary>
    <content type="html">&lt;p&gt;&lt;strong&gt;Publication:&lt;/strong&gt; 29 pages, 9 figures&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Category:&lt;/strong&gt; Method&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tasks:&lt;/strong&gt; CLS&lt;/p&gt;</content>
    <category term="Computer Vision" />
    <category term="Artificial Intelligence" />
  </entry>
  <entry>
    <title>Asynchronous Remote Sensing Time-Series Fusion for Cloud Removal and Anytime Reconstruction</title>
    <link href="http://arxiv.org/abs/2605.27726v1" rel="alternate" type="text/html" />
    <id>http://arxiv.org/abs/2605.27726v1</id>
    <published>2026-05-26T00:00:00Z</published>
    <updated>2026-05-26T00:00:00Z</updated>
    <author>
      <name>Forouzan Fallah</name>
    </author>
    <author>
      <name>Chia Yu Hsu</name>
    </author>
    <author>
      <name>Wenwen Li</name>
    </author>
    <author>
      <name>Anna Liljedahl</name>
    </author>
    <author>
      <name>Yezhou Yang</name>
    </author>
    <summary type="text">Frequent cloud cover severely limits the usability of Sentinel-2 (S2) optical time series for Earth surface monitoring. Sentinel-1 (S1) SAR provides all-weather complementary observations, but practical S1/S2 fusion remains difficult because acquisitions are irregular and asynchronous. Many existing approaches assume temporally aligned inputs (or require external nearest-date matching) and typically restore only observed timestamps, limiting reconstruction under long gaps and preventing on-demand synthesis. We propose AGFlow (Time Aligned Generative Flow Matching), a spatiotemporal flow-matching model for S1/S2 cloud removal and time-series reconstruction with three capabilities: (1) timestamp-conditioned internal alignment that fuses asynchronous S1 and cloudy S2 observations without preprocessing-based pairing; (2) spatiotemporal, context-aware denoising that models spatial structure jointly with temporal dynamics (rather than independent per-pixel time series); and (3) anytime querying, enabling generation of cloud-free S2 frames at both observed and user-specified timestamps within the monitoring window. We evaluate on the RESTORE-DiT benchmark protocol with quantitative metrics, qualitative comparisons, and component ablations. AGFlow notably improves fully missing-frame reconstruction (MAE and RMSE decrease by 16-19% relative to RESTORE-DiT) and provides reliable reconstructions under persistent gaps, while also yielding competitive cloud removal performance and flexible temporal querying for downstream tasks such as dense vegetation monitoring.</summary>
    <content type="html">&lt;p&gt;&lt;strong&gt;Publication:&lt;/strong&gt; CVPR 2026&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Category:&lt;/strong&gt; Method&lt;/p&gt;</content>
    <category term="Computer Vision" />
  </entry>
  <entry>
    <title>Location Prior Generation via Multi-Source Urban Data Fusion for Low-Altitude Air Mobility</title>
    <link href="http://arxiv.org/abs/2605.25530v1" rel="alternate" type="text/html" />
    <id>http://arxiv.org/abs/2605.25530v1</id>
    <published>2026-05-25T00:00:00Z</published>
    <updated>2026-05-25T00:00:00Z</updated>
    <author>
      <name>Xiang Xie</name>
    </author>
    <author>
      <name>Xiaonan Liu</name>
    </author>
    <summary type="text">Building height, the third dimension (3D) of urban spatial data, is absent in over 95% of structures in global geospatial databases. For the emerging low-altitude economy, this data gap forces each aerial platform to rely on real-time onboard sensing rather than pre-computed 3D scene geometry. We present the Location Prior Generation Framework (LPGF), a multi-source data fusion pipeline that integrates Sentinel-2 imagery, UAV telemetry, vehicle GPS trajectories, and OpenStreetMap footprints into structured, reusable urban location priors. LPGF assigns building heights through a three-tier priority hierarchy: (1) explicit OSM height tags where available, (2) floor count multiplied by 3.2 m per story where recorded, and (3) building-type default heights otherwise, yielding a worst-case error of approximately 5.5 m. An optional shadow-based height estimation module (SHEM) is activated only when a four-criterion quality gate is satisfied; when any criterion fails, the pipeline routes to structured fallback. On the MiTra A50 Milan dataset, the quality gate correctly identified two imaging failure modes: sub-pixel shadows at 10 m GSD and ground shadow merging at 0.93 m GSD, producing a consistent 27-building prior in both cases. Tier 3 type-default heights were validated against manual floor counts (n=15), achieving an MAE of 3.07 m, within the 5.0 m uncertainty bound. The framework demonstrates that structured, quality-gated fusion of universally available data streams can bootstrap 3D scene coverage for low-altitude urban operations.</summary>
    <content type="html">&lt;p&gt;&lt;strong&gt;Publication:&lt;/strong&gt; 11 pages, 7 figures&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Category:&lt;/strong&gt; Method&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tasks:&lt;/strong&gt; 3D&lt;/p&gt;</content>
    <category term="Computer Vision" />
  </entry>
  <entry>
    <title>Coarse-to-Fine Domain Incremental Learning with Attentive Distillation for Mining Footprint Segmentation in Multispectral Imagery</title>
    <link href="http://arxiv.org/abs/2605.24460v2" rel="alternate" type="text/html" />
    <id>http://arxiv.org/abs/2605.24460v2</id>
    <published>2026-05-23T00:00:00Z</published>
    <updated>2026-05-23T00:00:00Z</updated>
    <author>
      <name>Alif Tri Handoyo</name>
    </author>
    <author>
      <name>Vincent C. S. Lee</name>
    </author>
    <author>
      <name>Rizka Widyarini Purwanto</name>
    </author>
    <author>
      <name>Alex M. Lechner</name>
    </author>
    <author>
      <name>Deanna Kemp</name>
    </author>
    <author>
      <name>Muhamad Risqi U. Saputra</name>
    </author>
    <summary type="text">Automatically mapping and segmenting global mining footprints using remote sensing and deep learning is critical for monitoring the socio-environmental risks and impacts of mining, yet its progress is hindered by the scarcity of fine-grained annotated data. Although large-scale datasets with coarse boundaries are widely available, leveraging them to improve fine-grained segmentation is challenging due to significant domain shift. To address this, we propose MineC2FNet, a coarse-to-fine domain incremental learning framework that exploits abundant coarse data to enhance fine-grained mining footprint segmentation. MineC2FNet adopts a teacher-student architecture with attentive distillation at both the feature and prediction levels, selectively transferring generalized knowledge from the coarse domain while enabling boundary refinement using limited fine-grained data (fine domain). We further introduce an expertly validated dataset of 219 images with precise boundary annotations across diverse geographies and commodities. Extensive experiments against state-of-the-art approaches, including domain adaptation and domain incremental learning methods, demonstrate that MineC2FNet achieves superior performance while effectively handling domain shift. The dataset and code are publicly available at https://github.com/risqiutama/MineC2FNet.</summary>
    <content type="html">&lt;p&gt;&lt;strong&gt;Code:&lt;/strong&gt; &lt;a href="https://github.com/risqiutama/MineC2FNet"&gt;https://github.com/risqiutama/MineC2FNet&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Category:&lt;/strong&gt; Method&lt;/p&gt;</content>
    <category term="Computer Vision" />
    <category term="Artificial Intelligence" />
  </entry>
</feed>