<?xml version='1.0' encoding='utf-8'?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>RS-Paper-Hub — SAR Papers</title>
  <id>https://rspaper.top/output/feed_sar.xml</id>
  <link href="https://rspaper.top/output/feed_sar.xml" rel="self" type="application/atom+xml" />
  <link href="https://rspaper.top" rel="alternate" type="text/html" />
  <updated>2026-05-18T02:08:55Z</updated>
  <subtitle>Latest remote sensing papers (last 7 days) — 2 entries</subtitle>
  <author>
    <name>RS-Paper-Hub</name>
    <uri>https://rspaper.top</uri>
  </author>
  <entry>
    <title>Can LLM Agents Respond to Disasters? Benchmarking Heterogeneous Geospatial Reasoning in Emergency Operations</title>
    <link href="http://arxiv.org/abs/2605.11633v1" rel="alternate" type="text/html" />
    <id>http://arxiv.org/abs/2605.11633v1</id>
    <published>2026-05-12T00:00:00Z</published>
    <updated>2026-05-12T00:00:00Z</updated>
    <author>
      <name>Junjue Wang</name>
    </author>
    <author>
      <name>Weihao Xuan</name>
    </author>
    <author>
      <name>Heli Qi</name>
    </author>
    <author>
      <name>Pengyu Dai</name>
    </author>
    <author>
      <name>Kunyi Liu</name>
    </author>
    <author>
      <name>Hongruixuan Chen</name>
    </author>
    <author>
      <name>Zhuo Zheng</name>
    </author>
    <author>
      <name>Junshi Xia</name>
    </author>
    <author>
      <name>Stefano Ermon</name>
    </author>
    <author>
      <name>Naoto Yokoya</name>
    </author>
    <summary type="text">Operational disaster response goes beyond damage assessment, requiring responders to integrate multi-sensor signals, reason over road networks, populations and key facilities, plan evacuations, and produce actionable reports. However, prior work largely isolates remote-sensing perception or evaluates generic tool use, leaving the end-to-end workflows of emergency operations underexplored. In this paper, we introduce the Disaster Operational Response Agent benchmark (DORA), the first agentic benchmark for end-to-end disaster response: 515 expert-authored tasks across 45 real-world disaster events spanning 10 types, paired with expert-verified, replayable gold trajectories totaling 3,500 tool-call steps. Tasks span five dimensions that cover the operational disaster-response pipeline: disaster perception, spatial relational analysis, rescue and evacuation planning, temporal evolution reasoning, and multi-modal report synthesis. Agents compose calls from a 108-tool MCP library over heterogeneous geospatial data: optical, SAR, and multi-spectral imagery across single-, bi-, and multi-temporal sequences (0.015-10 m GSD), complemented by elevation and social vector layers. We comprehensively evaluate 13 frontier LLMs on our benchmark, revealing three persistent challenges: 1) disaster-domain grounding exposes unique failure modes (damage-semantic grounding, sensor-modality mismatch, and disaster-pipeline composition); 2) agents are doubly bottlenecked by tool selection and argument grounding, where gold tool-order hints improve accuracy by only 1.08-4.40%, and alternative scaffolds yield at most a 3.24% gain; 3) compositional fragility scales with trajectory length, the agent-to-gold gap widening from 7% to 56% on long pipelines. DORA establishes a rigorous testbed for operationally reliable disaster-response agents.</summary>
    <content type="html">&lt;p&gt;&lt;strong&gt;Publication:&lt;/strong&gt; DORA stress-tests LLM agents on real-world disaster operations that demand comprehensive orchestration of 108 specialized tools over heterogeneous geospatial data&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Category:&lt;/strong&gt; Method&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tasks:&lt;/strong&gt; VG&lt;/p&gt;</content>
    <category term="Artificial Intelligence" />
  </entry>
  <entry>
    <title>TAR: Text Semantic Assisted Cross-modal Image Registration Framework for Optical and SAR Images</title>
    <link href="http://arxiv.org/abs/2605.12064v1" rel="alternate" type="text/html" />
    <id>http://arxiv.org/abs/2605.12064v1</id>
    <published>2026-05-12T00:00:00Z</published>
    <updated>2026-05-12T00:00:00Z</updated>
    <author>
      <name>Zhuoyu Cai</name>
    </author>
    <author>
      <name>Dou Quan</name>
    </author>
    <author>
      <name>Ning Huyan</name>
    </author>
    <author>
      <name>Pei He</name>
    </author>
    <author>
      <name>Shuang Wang</name>
    </author>
    <author>
      <name>Licheng Jiao</name>
    </author>
    <summary type="text">Existing deep learning-based methods can capture shared features from optical and synthetic aperture radar (SAR) images for spatial alignment. However, optical-SAR registration remains challenging under large geometric deformations, because the model needs to simultaneously handle cross-modal appearance discrepancies and complex spatial transformations. To address this issue, this paper proposes a text semantic-assisted cross-modal image registration framework, named TAR, for optical and SAR images. TAR exploits text semantic priors from remote sensing scenes and land-cover categories to alleviate the modality gap and enhance cross-modal feature learning. TAR consists of three components: a multi-scale visual feature learning (MSFL) module, a text-assisted feature enhancement (TAFE) module, and a coarse-to-fine dense matching (CFDM) module. MSFL extracts multi-scale visual features from optical and SAR images. TAFE constructs text descriptors related to remote sensing scenes and land-cover objects, and uses a frozen RemoteCLIP text encoder to extract text features. These text features are introduced through visual-text interaction to enhance high-level visual features for more reliable coarse matching. CFDM then establishes coarse correspondences based on the enhanced high-level features and refines the matched locations using low-level features. Experimental results on cross-modal remote sensing images demonstrate the effectiveness of TAR, which achieves stronger matching performance than several state-of-the-art methods and yields significant gains under large geometric deformations.</summary>
    <content type="html">&lt;p&gt;&lt;strong&gt;Category:&lt;/strong&gt; Method&lt;/p&gt;</content>
    <category term="Computer Vision" />
  </entry>
</feed>