RS Paper Hub

A curated collection of remote sensing papers from arXiv

Overview

RS Paper Hub is an automated system that collects, filters, and categorizes remote sensing papers from arXiv. It helps researchers stay updated with the latest developments in remote sensing, vision-language models, and AI agents.

The project provides multiple ways to access papers: a web interface, RSS feeds, and direct JSON downloads. All papers are automatically tagged with categories, tasks, and VLM-related keywords.

Features

Daily Automated Updates

Papers are fetched and processed automatically every day, ensuring you always have access to the latest research.

Task Tagging (11 types)

Each paper is automatically tagged with one or more task types: Classification, Retrieval, Geolocation, VQA, Captioning, Change Detection, Object Detection, Visual Grounding, Segmentation, Super-Resolution, and 3D Reconstruction.

Topic Filtering

Papers related to VLM, Agent, UAV, and SAR are automatically detected via keyword matching and can be browsed as separate data sources.

Trends & Statistics

Interactive trend dashboards showing yearly/monthly paper distributions and top author rankings across all data sources, with click-to-drill-down details.

Paper Classification

Papers are classified into Method, Survey, or Dataset categories using rule-based patterns.

Paper Collection & BibTeX Export

Collect favorite papers and export them as BibTeX for easy reference management.

RSS Feeds & Zotero

Subscribe to RSS feeds for all papers or for a single topic (VLM, Agent, UAV, or SAR). Integrate with Zotero for automatic bibliography management.

Clickable Tag Filtering

Click on any tag (date, type, category, task, VLM, etc.) on a paper card to instantly filter by that value. Multiple tags can be stacked for progressive filtering, and clicking the same tag again removes the filter.

Group Export & Sharing

Export selected papers as a Group JSON file with a custom name. Groups appear in the "All Groups" dropdown for quick filtering. Submit your group via PR, or set up auto-updated groups by author name — the pipeline keeps them in sync daily.

Skills Page

A curated collection of research skills — coding, writing, workflows and beyond. Browse community-contributed tools and methodologies to boost your research productivity.

Venues Page

Quick-reference directory of key journals and conferences in remote sensing and related AI/CV fields, organized by category with direct links to official sites.

Resources Page

Community-driven collection of remote sensing datasets and tools, organized by category. Contribute via Pull Request.

🛰️ Radar — Globe Word Cloud & Weekly Rankings

Interactive 3D globe word cloud of BERT-weighted keywords, paired with weekly hot paper rankings scored by keyword matching. Switch data sources and ranking modes on the fly.

Web Viewer Guide

The web interface provides powerful tools for exploring and filtering papers:

Four-Tab Data Source Switching

Use the tabs at the top to switch between: All Papers, UAV papers, Vision-Language Model papers, and Agent papers. The selection is preserved in the URL for sharing.

Search & Filtering

  • Search by title, author, or abstract with exact phrase or fuzzy (split-word) matching
  • Filter by year, month, paper type, category, VLM keywords, and task tags
  • Quick filters for "Today" and "This Week" to see recently published papers
  • Toggle "With Code" to show only papers with available code

Clickable Tag Filtering

  • Click any tag on a paper card (date, type, category, task, VLM keyword, etc.) to filter all papers by that value
  • Click different tags to stack filters progressively — e.g., click "Computer Vision" then "Classification" to find papers matching both
  • Click the same tag again to remove that filter; active tags are highlighted with a border

Charts & Visualization

Click "Charts" to view distribution visualizations. Click on bars to filter by that category. Multiple selections are supported.

Paper Actions

  • Click on paper title to open the arXiv page
  • Click on abstract to expand/collapse
  • Collect papers to your personal collection
  • Export filtered results or collection as BibTeX
  • View BibTeX citation or open in Google Scholar

Recent Papers Panel

Click the "NEW PAPERS" button on the right side to view papers published today or this week.

Skills & Venues Pages

  • Use the floating sidebar on the left (or bottom bar on mobile) to switch between Papers, Skills, and Venues pages
  • The Skills page features community-contributed research tools and methodologies for coding, writing, and more
  • The Venues page provides a categorized directory of key journals and conferences in remote sensing and AI/CV fields with direct links

Radar Page

  • Access via the Radar icon in the sidebar (between Papers and Trends); the page shows a 3D rotating globe word cloud and weekly hot paper rankings side by side
  • Click and drag the globe to rotate it manually; click any keyword to instantly search that term in the Papers page; colors are per-word for easy visual scanning
  • Switch between three ranking modes — Trend (recency-weighted keyword score), Frequency (raw keyword hit count), and Comprehensive (combined score) — and switch data sources (All, VLM, UAV, Agent, SAR) using the source tabs

RSS & Zotero

RS Paper Hub provides RSS feeds that can be used with any RSS reader or integrated with Zotero:

Available RSS Feeds

  • All Papers: output/feed.xml
  • VLM Papers: output/feed_vlm.xml
  • Agent Papers: output/feed_agent.xml
  • UAV Papers: output/feed_uav.xml
  • SAR Papers: output/feed_sar.xml

Using with Zotero

In Zotero, go to File → New Feed, and paste one of the RSS URLs above. Zotero will automatically fetch new papers as they are added.

Direct JSON Download

For programmatic access, JSON files are also available:

  • output/papers.json - All papers
  • output/papers_vlm.json - VLM papers
  • output/papers_agent.json - Agent papers
  • output/papers_uav.json - UAV papers
  • output/papers_sar.json - SAR papers
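
As a rough illustration of programmatic access, the sketch below loads one of these files and keeps the papers that link a code repository, optionally restricted to one year. It assumes the file holds a top-level JSON array of entries using the field names from the Output Schema section; the helper names load_papers and papers_with_code are hypothetical and not part of the project.

```python
import json
from pathlib import Path


def load_papers(path):
    """Read paper entries from a JSON file (assumed to be a top-level array)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def papers_with_code(papers, year=None):
    """Keep entries with a non-empty `code` field, optionally for one year."""
    selected = []
    for paper in papers:
        if not paper.get("code"):
            continue  # empty string means no repository was found
        if year is not None and paper.get("Year") != year:
            continue
        selected.append(paper)
    return selected


# In-memory sample standing in for load_papers("output/papers.json")
sample = [
    {"Title": "Paper A", "Year": 2025, "code": "https://example.com/a"},
    {"Title": "Paper B", "Year": 2025, "code": ""},
    {"Title": "Paper C", "Year": 2024, "code": "https://example.com/c"},
]
print([p["Title"] for p in papers_with_code(sample, year=2025)])
```

With real data, replace the sample list with the result of load_papers on one of the paths listed above.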

Pipeline

The data processing pipeline runs the following steps in order:

1. Fetch arXiv: Query the arXiv API for recent submissions in relevant categories (or use the web scraping fallback when the API is down)
2. Filter RS: Apply keyword-based filtering to identify remote sensing papers
3. Extract Metadata: Parse paper titles, abstracts, authors, and dates
4. Topic Detection: Use keyword matching to detect VLM, Agent, UAV, and SAR related papers
5. Task Tagging: Tag papers with one or more of the 11 task types based on content
6. Classification: Categorize as Method, Survey, or Dataset
7. Generate Feeds: Create RSS feeds and JSON outputs
8. Generate Trends: Compute the yearly/monthly paper distributions and top author rankings shown on the trend dashboards
9. Deploy: Update the website with new data
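
The keyword-matching idea behind Topic Detection can be sketched as follows. This is only an illustration: the keyword list is made up for the example (the project's real patterns are read from its config files and are not reproduced here), and match_keywords and detect_vlm are hypothetical names. The output fields follow the Output Schema section.

```python
import re

# Hypothetical keyword list for illustration; not the project's actual config.
VLM_KEYWORDS = ["vision-language", "CLIP", "multimodal large language model"]


def match_keywords(text, keywords):
    """Return the keywords found in text as whole words, ignoring case."""
    hits = []
    for kw in keywords:
        pattern = r"(?<!\w)" + re.escape(kw) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append(kw)
    return hits


def detect_vlm(paper):
    """Return a copy of the entry with `_is_vlm` and `_vlm_keywords` filled in."""
    text = f"{paper.get('Title', '')} {paper.get('Abstract', '')}"
    hits = match_keywords(text, VLM_KEYWORDS)
    return {**paper, "_is_vlm": bool(hits), "_vlm_keywords": ";".join(hits)}


tagged = detect_vlm({"Title": "A CLIP-based model for remote sensing", "Abstract": ""})
print(tagged["_is_vlm"], tagged["_vlm_keywords"])
```

The same function shape would apply to the Agent, UAV, and SAR topics with their own keyword lists.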

CLI Reference

Run the pipeline using Python:

python main.py [--category CAT] [--max-results N] [--output DIR]

Arguments

| Argument | Description | Default |
| --- | --- | --- |
| --category, -c | arXiv category to search (e.g., cs.CV, eess.IV) | cs.CV |
| --max-results, -m | Maximum number of papers to fetch | 300 |
| --output, -o | Output directory for JSON files | output/ |
| --keywords, -k | Additional keywords for filtering (comma-separated) | remote sensing |
| --vlm-keywords | JSON file with VLM keyword patterns | config/vlm_keywords.json |
| --task-tags | JSON file with task tag patterns | config/task_tags.json |
| --dry-run | Show what would be done without making changes | False |
| --verbose, -v | Enable verbose logging | False |

Examples

# Fetch latest papers
python main.py

# Fetch with custom category and limit
python main.py --category eess.IV --max-results 500

# Dry run to see what would be fetched
python main.py --dry-run

# Verbose output
python main.py --verbose

Web Scraping Fallback

The arXiv API can occasionally be unreachable. A web scraping fallback is provided that produces the same output format by scraping arXiv search pages directly.

Usage

# Use web scraper instead of API
bash run_daily.sh --web

# Or run directly
python main_web.py --update

# Look back 14 days instead of default 7
python main_web.py --update --days 14

# Limit fetch count
python main_web.py --update --max-results 50

How it works

  • web_scraper.py scrapes arxiv.org/search/ for "remote sensing" and "earth observation" papers
  • main_web.py handles incremental merging — existing papers are never modified, only new papers are appended
  • The downstream pipeline (pipeline.py) works identically regardless of which fetch method is used
  • All fields are guaranteed to have values (empty string instead of NaN/null for missing data)
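
The append-only merge and the "no NaN/null" guarantee described above could look roughly like this. It is a sketch under assumptions, not the code in main_web.py: it treats Paper_link as the identity of a paper, and the function names are hypothetical.

```python
import math


def clean_value(value):
    """Map None and NaN to an empty string so every field has a value."""
    if value is None:
        return ""
    if isinstance(value, float) and math.isnan(value):
        return ""
    return value


def merge_incremental(existing, fetched, key="Paper_link"):
    """Append papers whose key is not yet known; never touch existing entries."""
    seen = {paper[key] for paper in existing}
    merged = list(existing)
    for paper in fetched:
        if paper[key] in seen:
            continue  # already stored: keep the old entry unchanged
        merged.append({k: clean_value(v) for k, v in paper.items()})
        seen.add(paper[key])
    return merged


old = [{"Paper_link": "http://arxiv.org/abs/2401.12345v1", "Title": "Old title"}]
new = [
    {"Paper_link": "http://arxiv.org/abs/2401.12345v1", "Title": "Changed title"},
    {"Paper_link": "http://arxiv.org/abs/2401.67890v2", "Title": "New paper", "code": None},
]
print(merge_incremental(old, new))
```

Keying on the full link means a new arXiv version (v1 to v2) would count as a new paper in this sketch; whether the project strips version suffixes is not stated here.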

Arguments

| Argument | Description | Default |
| --- | --- | --- |
| --update | Quick update (latest 7 days) | off |
| --days N | Number of days to look back in update mode | 7 |
| --max-results N | Maximum number of papers to fetch | unlimited |
| --with-code | Query Papers With Code for code repos | off |
| -v, --verbose | Verbose logging | off |

Output Schema

Each paper entry in the JSON output contains the following fields:

| Field | Type | Description |
| --- | --- | --- |
| Paper_link | string | URL to the arXiv paper page |
| Title | string | Paper title |
| Authors | string | Comma-separated list of authors |
| Abstract | string | Paper abstract (may contain LaTeX) |
| Date | string | Submission date (YYYY-MM-DD) |
| Year | integer | Publication year |
| Month | integer | Publication month (1-12) |
| Type | string | Paper type (e.g., cs.CV, eess.IV) |
| Subtype | string | Paper subtype or arXiv ID |
| Category | string | Classification: Method, Survey, or Dataset |
| Publication | string | Conference or journal name if published |
| code | string | URL to code repository if available |
| BibTex | string | BibTeX citation entry |
| _is_vlm | boolean | Whether paper is VLM-related |
| _vlm_keywords | string | Semicolon-separated VLM keywords matched |
| _is_agent | boolean | Whether paper is Agent-related |
| _tasks | string | Semicolon-separated task tags |
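
Several fields pack lists into delimited strings, so a consumer will usually want to unpack them. The sketch below turns one entry into a small typed record; the Paper dataclass and parse_entry are hypothetical helpers written against the field names above, and the sample values are invented.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Paper:
    link: str
    title: str
    authors: List[str]
    submitted: Optional[date]
    tasks: List[str]
    is_vlm: bool


def split_field(value, sep):
    """Split a delimited string and drop empty or whitespace-only parts."""
    return [part.strip() for part in value.split(sep) if part.strip()]


def parse_entry(entry):
    """Convert one JSON entry into a Paper record."""
    raw_date = entry.get("Date", "")
    return Paper(
        link=entry["Paper_link"],
        title=entry["Title"],
        authors=split_field(entry.get("Authors", ""), ","),
        # Missing values are stored as empty strings, so guard before parsing
        submitted=date.fromisoformat(raw_date) if raw_date else None,
        tasks=split_field(entry.get("_tasks", ""), ";"),
        is_vlm=bool(entry.get("_is_vlm", False)),
    )


example = parse_entry({
    "Paper_link": "http://arxiv.org/abs/2401.12345v1",
    "Title": "An example paper",
    "Authors": "A. Author, B. Author",
    "Date": "2024-01-22",
    "_tasks": "CD;SEG",
    "_is_vlm": True,
})
print(example.authors, example.tasks, example.submitted)
```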

Task Tags

Papers are automatically tagged with one or more of the following 11 task types:

| Tag | Full Name | Description |
| --- | --- | --- |
| CLS | Classification | Classifying remote sensing images into categories |
| ITR | Image-Text Retrieval | Finding images from text queries or vice versa |
| GeoLoc | Geolocation | Predicting the geographic location of images |
| VQA | Visual Question Answering | Answering questions about images |
| IC | Image Captioning | Generating text descriptions for images |
| CD | Change Detection | Identifying changes between multi-temporal images |
| OD | Object Detection | Detecting and localizing objects in images |
| VG | Visual Grounding | Linking text phrases to image regions |
| SEG | Segmentation | Segmenting images at pixel or instance level |
| SR | Super-Resolution | Enhancing image resolution |
| 3D | 3D Reconstruction | Reconstructing 3D models from images |
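
Pattern-based tagging of this kind can be sketched in a few lines. The patterns below cover only three of the eleven tags and are invented for the example; the project's actual patterns live in its task tag config and may differ. The semicolon-joined result mirrors the _tasks field.

```python
import re

# Hypothetical patterns for three tags; not the project's real configuration.
TASK_PATTERNS = {
    "CD": r"change detection",
    "OD": r"object detection",
    "SEG": r"segmentation",
}


def tag_tasks(title, abstract):
    """Return the matching task tags joined by semicolons ("" if none match)."""
    text = f"{title} {abstract}".lower()
    tags = [tag for tag, pattern in TASK_PATTERNS.items() if re.search(pattern, text)]
    return ";".join(tags)


print(tag_tasks("Change Detection with Semantic Segmentation Priors", ""))
```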

Submit Group

You can create a custom paper group to curate and share a reading list with the community. Groups appear in the "All Groups" dropdown on the main page.

1. Prepare your paper list

Create a JSON file containing an array of arXiv paper links. You can generate it with the "Export Group" button in the Export dialog (By author), or create it manually. Each link should be a full arXiv abstract URL:

[
  "http://arxiv.org/abs/2401.12345v1",
  "http://arxiv.org/abs/2401.67890v2"
]

2. Register your group

Add an entry to groups/index.json with your group's metadata:

[
  {
    "key": "my-group",
    "label": "My Research Group",
    "label_zh": "我的研究组",
    "file": "my-group.json"
  }
]

| Field | Description |
| --- | --- |
| key | Unique identifier for your group (use lowercase and hyphens) |
| label | Display name in English |
| label_zh | Display name in Chinese (optional) |
| file | Filename of your paper list JSON (placed in the groups/ directory) |

3. Submit a Pull Request

Fork the repository, add your JSON file to the groups/ directory, update groups/index.json, and submit a Pull Request. Once merged, your group will appear in the dropdown for all users.

Auto-updated groups by author

You can also create groups that automatically stay up to date. Add "auto": true and an "authors" array to your entry in groups/index.json. The pipeline will match papers by author name and update the group file daily.

{
  "key": "xian-sun",
  "label": "Xian Sun",
  "label_zh": "孙显",
  "file": "xian-sun.json",
  "auto": true,
  "authors": ["Xian Sun"]
}

Multiple author name variants are supported (e.g., ["Gui-Song Xia", "Guisong Xia"]). No need to create the JSON file manually — it will be generated automatically.
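
To make the author matching concrete, here is a minimal sketch of how a group's link list might be built from the "authors" array. It assumes a case-insensitive exact match against the comma-separated Authors field; the real pipeline's matching rules are not documented here, and build_author_group is a hypothetical name.

```python
def build_author_group(papers, authors):
    """Collect links of papers whose author list contains any name variant."""
    wanted = {name.casefold() for name in authors}
    links = []
    for paper in papers:
        names = {n.strip().casefold() for n in paper.get("Authors", "").split(",")}
        if wanted & names:
            links.append(paper["Paper_link"])
    return links


papers = [
    {"Paper_link": "http://arxiv.org/abs/2401.12345v1", "Authors": "Gui-Song Xia, A. Author"},
    {"Paper_link": "http://arxiv.org/abs/2401.67890v2", "Authors": "B. Author, Guisong Xia"},
    {"Paper_link": "http://arxiv.org/abs/2402.00001v1", "Authors": "C. Author"},
]
print(build_author_group(papers, ["Gui-Song Xia", "Guisong Xia"]))
```

Under exact matching, listing each spelling variant is what lets both of the first two sample papers land in the group.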

Radar Page

The Radar page combines a 3D globe word cloud with weekly hot paper rankings to give you a visual overview of trending topics across all data sources.

Globe Word Cloud

  • Keywords from wordcloud/keywords.json are distributed on a Fibonacci sphere and rendered as a 3D rotating globe
  • Each word has a unique color from a 20-color palette; back-facing words are desaturated for a natural depth effect
  • Latitude and longitude grid lines are drawn faintly on the globe surface for visual polish
  • Drag to rotate manually; click any keyword to apply it as a search filter in the Papers page
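
For readers unfamiliar with the Fibonacci sphere, the sketch below shows the standard golden-angle construction that spreads n points roughly evenly over a unit sphere. It is written in Python for illustration only; the page's own rendering code is not shown in this document and may differ in detail.

```python
import math


def fibonacci_sphere(n):
    """Return n (x, y, z) points spread roughly evenly on the unit sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        y = 1.0 - 2.0 * (i + 0.5) / n      # height runs from near +1 to near -1
        radius = math.sqrt(1.0 - y * y)    # radius of the circle at that height
        theta = golden_angle * i           # rotate by the golden angle each step
        points.append((radius * math.cos(theta), y, radius * math.sin(theta)))
    return points


# One point per keyword, e.g. 20 keywords
positions = fibonacci_sphere(20)
print(len(positions))
```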

Weekly Hot Paper Rankings

Papers from the current week are scored by BERT-weighted keyword matching and ranked in three modes:

| Mode | Description |
| --- | --- |
| Trend | Score weighted by keyword recency (2025 × 2.5, 2024 × 2.0, 2023 × 1.2, 2022 × 0.8) |
| Frequency | Raw count of matched keywords across title and abstract |
| Comprehensive | Combined trend + frequency score for a balanced ranking |
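
One way to read the three modes is sketched below. Only the year multipliers come from the table; everything else is an assumption made for illustration: each keyword is taken to carry a "year", years outside the table default to 1.0, the comprehensive score is a plain sum, and the BERT-derived keyword weights are left out entirely.

```python
# Year multipliers from the table above
YEAR_WEIGHTS = {2025: 2.5, 2024: 2.0, 2023: 1.2, 2022: 0.8}


def score_paper(paper, keywords):
    """Score one paper against keywords given as {"word": str, "year": int}."""
    text = f"{paper.get('Title', '')} {paper.get('Abstract', '')}".lower()
    matched = [kw for kw in keywords if kw["word"].lower() in text]
    frequency = len(matched)
    # Assumption: years not listed in the table get a neutral weight of 1.0
    trend = sum(YEAR_WEIGHTS.get(kw["year"], 1.0) for kw in matched)
    # Assumption: "combined" is an unweighted sum of the two scores
    return {"trend": trend, "frequency": frequency, "comprehensive": trend + frequency}


keywords = [
    {"word": "vision-language", "year": 2025},
    {"word": "change detection", "year": 2022},
    {"word": "diffusion", "year": 2024},
]
paper = {"Title": "Vision-language model for SAR change detection", "Abstract": ""}
print(score_paper(paper, keywords))
```

Ranking a week's papers would then amount to sorting them by the score for the selected mode.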

Source Switching

Use the source tabs (All Papers, VLM, UAV, Agent, SAR) to filter both the globe keywords and the weekly ranking to a specific topic area. Each source has its own keyword set and paper pool.

Submit Resource

You can contribute datasets, tools, or libraries to the Resources page. All resource data is stored as JSON files in the resources/ directory, making it easy to add or update via Pull Request.

Directory Structure

resources/
  index.json          # Category index
  scene-cls.json      # Scene classification datasets
  obj-det.json        # Object detection datasets
  segmentation.json   # Segmentation datasets
  change-det.json     # Change detection datasets
  tools.json          # Tools & libraries

1. Add to an existing category

Open the corresponding JSON file (e.g., resources/obj-det.json) and add a new entry with name, url, and desc:

{
  "name": "My Dataset",
  "url": "https://example.com",
  "desc": "A brief one-line description of the dataset"
}

2. Add a new category

Create a new JSON file in the resources/ directory, then register it in resources/index.json:

{
  "key": "vqa",
  "label": "Datasets — VQA",
  "label_zh": "数据集 — 视觉问答",
  "file": "vqa.json",
  "icon": "cube"
}

Available icons: cube, grid, layers, trending, wrench.

3. Submit a Pull Request

Fork the repository, make your changes in the resources/ directory, and submit a Pull Request. Once merged, your contribution will appear on the Resources page for all users.