Overview

RS Paper Hub is an automated system that collects, filters, and categorizes remote sensing papers from arXiv. It helps researchers keep up with developments in remote sensing, vision-language models (VLMs), and AI agents.

The project provides multiple ways to access papers: a web interface, RSS feeds, and direct JSON downloads. All papers are automatically tagged with categories, tasks, and VLM-related keywords.

Features

Daily Automated Updates

Papers are fetched and processed automatically every day, so the site reflects the most recent arXiv submissions.

Task Tagging (11 types)

Each paper is automatically tagged with one or more task types: Classification, Retrieval, Geolocation, VQA, Captioning, Change Detection, Object Detection, Visual Grounding, Segmentation, Super-Resolution, and 3D Reconstruction.

Topic Filtering

Papers related to VLM, Agent, UAV, and SAR are automatically detected via keyword matching and can be browsed as separate data sources.

Trends & Statistics

Interactive trend dashboards showing yearly/monthly paper distributions and top author rankings across all data sources, with click-to-drill-down details.

Paper Classification

Papers are classified into Method, Survey, or Dataset categories using rule-based patterns.

Paper Collection & BibTeX Export

Collect favorite papers and export them as BibTeX for easy reference management.

RSS Feeds & Zotero

Subscribe to RSS feeds for all papers or for a single topic (VLM, Agent, UAV, or SAR). The feeds can also be added to Zotero for automatic bibliography management.

Clickable Tag Filtering

Click on any tag (date, type, category, task, VLM, etc.) on a paper card to instantly filter by that value. Multiple tags can be stacked for progressive filtering, and clicking the same tag again removes the filter.

Group Export & Sharing

Export selected papers as a Group JSON file with a custom name. Groups appear in the "All Groups" dropdown for quick filtering. Submit your group via PR, or set up auto-updated groups by author name — the pipeline keeps them in sync daily.

Skills Page

A curated collection of research skills — coding, writing, workflows and beyond. Browse community-contributed tools and methodologies to boost your research productivity.

Venues Page

Quick-reference directory of key journals and conferences in remote sensing and related AI/CV fields, organized by category with direct links to official sites.

Resources Page

Community-driven collection of remote sensing datasets and tools, organized by category. Contribute via Pull Request.

🛰️ Radar — Globe Word Cloud & Weekly Rankings

Interactive 3D globe word cloud of BERT-weighted keywords, paired with weekly hot paper rankings scored by keyword matching. Switch data sources and ranking modes on the fly.

Web Viewer Guide

The web interface provides the following tools for exploring and filtering papers:

Four-Tab Data Source Switching

Use the tabs at the top to switch between: All Papers, UAV papers, Vision-Language Model papers, and Agent papers. The selection is preserved in the URL for sharing.

Search & Filtering

Search by title, author, or abstract with exact phrase or fuzzy (split-word) matching
Filter by year, month, paper type, category, VLM keywords, and task tags
Quick filters for "Today" and "This Week" to see recently published papers
Toggle "With Code" to show only papers with available code
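
The difference between the two search modes can be illustrated with a short Python sketch. This is not the viewer's actual code. It assumes that exact mode looks for the whole query as one phrase and that fuzzy (split-word) mode requires every word of the query to appear somewhere, in any order. The function name `matches` is hypothetical.

```python
def matches(query, paper, fuzzy=False):
    """Check a query against a paper's title, authors, and abstract.

    Assumption: fuzzy mode means "every query word appears somewhere".
    """
    haystack = " ".join(
        paper.get(field, "") for field in ("Title", "Authors", "Abstract")
    ).lower()
    query = query.lower().strip()
    if not query:
        # An empty query filters nothing out.
        return True
    if fuzzy:
        return all(word in haystack for word in query.split())
    return query in haystack
```

Under these assumptions, the query "detection change" finds a paper titled "Change Detection in Remote Sensing Images" only in fuzzy mode, because the words appear in a different order than in the title.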

Clickable Tag Filtering

Click any tag on a paper card (date, type, category, task, VLM keyword, etc.) to filter all papers by that value
Click different tags to stack filters progressively — e.g., click "Computer Vision" then "Classification" to find papers matching both
Click the same tag again to remove that filter; active tags are highlighted with a border

Charts & Visualization

Click "Charts" to view distribution visualizations. Click on bars to filter by that category. Multiple selections are supported.

Paper Actions

Click a paper title to open its arXiv page
Click an abstract to expand or collapse it
Collect papers to your personal collection
Export filtered results or collection as BibTeX
View BibTeX citation or open in Google Scholar

Recent Papers Panel

Click the "NEW PAPERS" button on the right side to view papers published today or this week.

Skills & Venues Pages

Use the floating sidebar on the left (or bottom bar on mobile) to switch between Papers, Skills, and Venues pages
The Skills page features community-contributed research tools and methodologies for coding, writing, and more
The Venues page provides a categorized directory of key journals and conferences in remote sensing and AI/CV fields with direct links

Radar Page

Access via the Radar icon in the sidebar (between Papers and Trends); the page shows a 3D rotating globe word cloud and weekly hot paper rankings side by side
Click and drag the globe to rotate it manually; click any keyword to instantly search that term in the Papers page; colors are per-word for easy visual scanning
Switch between three ranking modes — Trend (recency-weighted keyword score), Frequency (raw keyword hit count), and Comprehensive (combined score) — and switch data sources (All, VLM, UAV, Agent, SAR) using the source tabs

RSS & Zotero

RS Paper Hub provides RSS feeds that can be used with any RSS reader or integrated with Zotero:

Available RSS Feeds

All Papers: output/feed.xml
VLM Papers: output/feed_vlm.xml
Agent Papers: output/feed_agent.xml
UAV Papers: output/feed_uav.xml
SAR Papers: output/feed_sar.xml

Using with Zotero

In Zotero, go to File → New Feed, and paste one of the RSS URLs above. Zotero will automatically fetch new papers as they are added.

Direct JSON Download

For programmatic access, JSON files are also available:

output/papers.json - All papers
output/papers_vlm.json - VLM papers
output/papers_agent.json - Agent papers
output/papers_uav.json - UAV papers
output/papers_sar.json - SAR papers
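
As a minimal sketch of programmatic access, the snippet below loads one of these files from a local checkout and keeps the VLM-related entries that list a code repository. It assumes the file holds a JSON array of paper objects using the field names from the Output Schema section. The helper names `load_papers` and `vlm_papers_with_code` are illustrative and not part of the project.

```python
import json
from pathlib import Path


def load_papers(path):
    """Load a papers JSON file, assumed to contain a list of paper objects."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def vlm_papers_with_code(papers):
    """Keep entries flagged as VLM-related that have a non-empty code URL."""
    return [p for p in papers if p.get("_is_vlm") and p.get("code")]
```

A typical call would be `vlm_papers_with_code(load_papers("output/papers.json"))`, after which each result can be read by field name, for example `paper["Title"]`.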

Pipeline

The data processing pipeline runs the following steps in order:

Fetch arXiv → Filter RS → Extract Metadata → Topic Detection → Task Tagging → Classification → Generate Feeds → Generate Trends → Deploy

Fetch arXiv: Query the arXiv API for recent submissions in relevant categories (or use the web scraping fallback when the API is down)
Filter RS: Apply keyword-based filtering to identify remote sensing papers
Extract Metadata: Parse paper titles, abstracts, authors, and dates
Topic Detection: Use keyword matching to flag VLM-, Agent-, UAV-, and SAR-related papers
Task Tagging: Classify papers into 11 task categories based on content
Classification: Categorize as Method, Survey, or Dataset
Generate Feeds: Create RSS feeds and JSON outputs
Generate Trends: Compute yearly and monthly paper distributions and top author rankings for the trend dashboards
Deploy: Update the website with new data
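
The keyword-matching idea behind the Topic Detection step can be sketched as follows. The keyword lists here are hypothetical placeholders. The project's real patterns are presumably kept in its configuration files (such as config/vlm_keywords.json), and its matching rules may differ.

```python
import re

# Hypothetical keyword lists, for illustration only.
TOPIC_KEYWORDS = {
    "vlm": ["vision-language", "vlm", "clip"],
    "agent": ["agent", "agents"],
    "uav": ["uav", "drone", "unmanned aerial"],
    "sar": ["sar", "synthetic aperture radar"],
}


def detect_topics(title, abstract):
    """Return the sorted topic names whose keywords occur as whole words or phrases."""
    text = f"{title} {abstract}".lower()
    found = []
    for topic, keywords in TOPIC_KEYWORDS.items():
        for keyword in keywords:
            # \b keeps "sar" from matching inside words such as "necessary".
            if re.search(rf"\b{re.escape(keyword)}\b", text):
                found.append(topic)
                break
    return sorted(found)
```

Matching on word boundaries rather than raw substrings is a design choice made for this sketch: short acronyms such as SAR or UAV would otherwise produce many false positives.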

CLI Reference

Run the pipeline using Python:

python main.py [--category CAT] [--max-results N] [--output DIR]

Arguments

Argument	Description	Default
`--category`, `-c`	arXiv category to search (e.g., cs.CV, eess.IV)	cs.CV
`--max-results`, `-m`	Maximum number of papers to fetch	300
`--output`, `-o`	Output directory for JSON files	output/
`--keywords`, `-k`	Additional keywords for filtering (comma-separated)	remote sensing
`--vlm-keywords`	JSON file with VLM keyword patterns	config/vlm_keywords.json
`--task-tags`	JSON file with task tag patterns	config/task_tags.json
`--dry-run`	Show what would be done without making changes	False
`--verbose`, `-v`	Enable verbose logging	False
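
For readers who want to script around the CLI, the table above maps naturally onto an `argparse` parser. The sketch below only mirrors the documented flags and defaults. It is not taken from main.py, and the helper names `build_parser` and `split_keywords` are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented arguments and defaults."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--category", "-c", default="cs.CV")
    parser.add_argument("--max-results", "-m", type=int, default=300)
    parser.add_argument("--output", "-o", default="output/")
    parser.add_argument("--keywords", "-k", default="remote sensing")
    parser.add_argument("--vlm-keywords", default="config/vlm_keywords.json")
    parser.add_argument("--task-tags", default="config/task_tags.json")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--verbose", "-v", action="store_true")
    return parser


def split_keywords(raw):
    """Turn the comma-separated --keywords value into a clean list."""
    return [part.strip() for part in raw.split(",") if part.strip()]
```

Calling `build_parser().parse_args([])` yields the defaults from the table, and `split_keywords` handles values such as `"remote sensing, earth observation"`.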

Examples

# Fetch latest papers
python main.py

# Fetch with custom category and limit
python main.py --category eess.IV --max-results 500

# Dry run to see what would be fetched
python main.py --dry-run

# Verbose output
python main.py --verbose

Web Scraping Fallback

The arXiv API can occasionally be unreachable. A web scraping fallback is provided that produces the same output format by scraping arXiv search pages directly.

Usage

# Use web scraper instead of API
bash run_daily.sh --web

# Or run directly
python main_web.py --update

# Look back 14 days instead of default 7
python main_web.py --update --days 14

# Limit fetch count
python main_web.py --update --max-results 50

How it works

web_scraper.py scrapes arxiv.org/search/ for "remote sensing" and "earth observation" papers
main_web.py handles incremental merging — existing papers are never modified, only new papers are appended
The downstream pipeline (pipeline.py) works identically regardless of which fetch method is used
All fields are guaranteed to have values (empty string instead of NaN/null for missing data)
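
The merge and fill rules above can be sketched in a few lines. This is an illustration under stated assumptions rather than the code in main_web.py: it assumes `Paper_link` identifies a paper uniquely, and the function names `normalize` and `merge_incremental` are hypothetical.

```python
def normalize(paper, fields):
    """Return a copy with every expected field present; None and NaN become ""."""
    clean = {}
    for field in fields:
        value = paper.get(field)
        # NaN is the only value that is not equal to itself.
        if value is None or value != value:
            value = ""
        clean[field] = value
    return clean


def merge_incremental(existing, fetched, fields, key="Paper_link"):
    """Append papers whose key is new; existing entries are left untouched."""
    seen = {paper[key] for paper in existing}
    merged = list(existing)
    for paper in fetched:
        if paper.get(key) and paper[key] not in seen:
            merged.append(normalize(paper, fields))
            seen.add(paper[key])
    return merged
```

Because entries already in `existing` are copied over as they are, a re-fetched paper with a changed title or abstract does not overwrite the stored version.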

Arguments

Argument	Description	Default
`--update`	Quick update (latest 7 days)	off
`--days N`	Number of days to look back in update mode	7
`--max-results N`	Maximum number of papers to fetch	unlimited
`--with-code`	Query Papers With Code for code repos	off
`-v, --verbose`	Verbose logging	off

Output Schema

Each paper entry in the JSON output contains the following fields:

Field	Type	Description
`Paper_link`	string	URL to the arXiv paper page
`Title`	string	Paper title
`Authors`	string	Comma-separated list of authors
`Abstract`	string	Paper abstract (may contain LaTeX)
`Date`	string	Submission date (YYYY-MM-DD)
`Year`	integer	Publication year
`Month`	integer	Publication month (1-12)
`Type`	string	Paper type (e.g., cs.CV, eess.IV)
`Subtype`	string	Paper subtype or arXiv ID
`Category`	string	Classification: Method, Survey, or Dataset
`Publication`	string	Conference or journal name if published
`code`	string	URL to code repository if available
`BibTex`	string	BibTeX citation entry
`_is_vlm`	boolean	Whether paper is VLM-related
`_vlm_keywords`	string	Semicolon-separated VLM keywords matched
`_is_agent`	boolean	Whether paper is Agent-related
`_tasks`	string	Semicolon-separated task tags
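
When consuming the JSON in your own scripts, a small check against the table above can catch malformed entries early. The sketch below uses only the field names and types listed in the table. The function name `schema_problems` is hypothetical, and the project itself may not perform such a check.

```python
from datetime import datetime

# Field names and types as listed in the schema table above.
EXPECTED_TYPES = {
    "Paper_link": str, "Title": str, "Authors": str, "Abstract": str,
    "Date": str, "Year": int, "Month": int, "Type": str, "Subtype": str,
    "Category": str, "Publication": str, "code": str, "BibTex": str,
    "_is_vlm": bool, "_vlm_keywords": str, "_is_agent": bool, "_tasks": str,
}


def schema_problems(entry):
    """Return a list of readable problems; an empty list means the entry passes."""
    problems = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in entry:
            problems.append(f"missing field: {field}")
        # Exact type check, so that True/False is not accepted as an integer.
        elif type(entry[field]) is not expected:
            problems.append(f"wrong type for {field}: {type(entry[field]).__name__}")
    if type(entry.get("Date")) is str:
        try:
            datetime.strptime(entry["Date"], "%Y-%m-%d")
        except ValueError:
            problems.append(f"Date is not YYYY-MM-DD: {entry['Date']}")
    if type(entry.get("Month")) is int and not 1 <= entry["Month"] <= 12:
        problems.append(f"Month out of range: {entry['Month']}")
    return problems
```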

Task Tags

Papers are automatically tagged with one or more of the following 11 task types:

Tag	Full Name	Description
`CLS`	Classification	Classifying remote sensing images into categories
`ITR`	Image-Text Retrieval	Finding images from text queries or vice versa
`GeoLoc`	Geolocation	Predicting the geographic location of images
`VQA`	Visual Question Answering	Answering questions about images
`IC`	Image Captioning	Generating text descriptions for images
`CD`	Change Detection	Identifying changes between multi-temporal images
`OD`	Object Detection	Detecting and localizing objects in images
`VG`	Visual Grounding	Linking text phrases to image regions
`SEG`	Segmentation	Segmenting images at pixel or instance level
`SR`	Super-Resolution	Enhancing image resolution
`3D`	3D Reconstruction	Reconstructing 3D models from images
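
The tag table translates directly into a lookup for downstream analysis. The sketch below assumes that the `_tasks` field stores the short tags from the first column, separated by semicolons. If a file stores full names instead, unknown values simply pass through unchanged. The helper names are illustrative.

```python
from collections import Counter

# Short tag -> full name, copied from the table above.
TASK_NAMES = {
    "CLS": "Classification",
    "ITR": "Image-Text Retrieval",
    "GeoLoc": "Geolocation",
    "VQA": "Visual Question Answering",
    "IC": "Image Captioning",
    "CD": "Change Detection",
    "OD": "Object Detection",
    "VG": "Visual Grounding",
    "SEG": "Segmentation",
    "SR": "Super-Resolution",
    "3D": "3D Reconstruction",
}


def expand_tasks(tasks_field):
    """Split a semicolon-separated _tasks string and map each tag to its full name."""
    tags = [tag.strip() for tag in tasks_field.split(";") if tag.strip()]
    return [TASK_NAMES.get(tag, tag) for tag in tags]


def count_tasks(papers):
    """Count how many papers carry each task, keyed by full task name."""
    counts = Counter()
    for paper in papers:
        counts.update(expand_tasks(paper.get("_tasks", "")))
    return counts
```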

Submit Group

You can create a custom paper group to curate and share a reading list with the community. Groups appear in the "All Groups" dropdown on the main page.

1. Prepare your paper list

Create a JSON file containing an array of arXiv paper links. You can generate it with the "Export Group" button in the Export dialog (By author), or create it manually. Each link should be a full arXiv abstract URL:

[
  "http://arxiv.org/abs/2401.12345v1",
  "http://arxiv.org/abs/2401.67890v2"
]
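
Before opening a Pull Request, a hand-written list can be checked with a short script. The sketch below is an optional local check, not part of the repository. The regular expression covers only new-style arXiv identifiers like the ones shown above, and the function name `invalid_links` is hypothetical.

```python
import json
import re

# Matches new-style arXiv abstract URLs such as http://arxiv.org/abs/2401.12345v1.
# Old-style identifiers (e.g. hep-th/9901001) are deliberately not covered here.
ARXIV_ABS_URL = re.compile(r"^https?://arxiv\.org/abs/\d{4}\.\d{4,5}(v\d+)?$")


def invalid_links(group_json_text):
    """Parse a group file's text and return entries that are not arXiv abstract URLs."""
    links = json.loads(group_json_text)
    if not isinstance(links, list):
        raise ValueError("a group file must contain a JSON array")
    return [
        link for link in links
        if not (isinstance(link, str) and ARXIV_ABS_URL.match(link))
    ]
```

An empty result means every entry looks like an abstract URL. PDF links such as `https://arxiv.org/pdf/...` are reported as invalid.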

2. Register your group

Add an entry to groups/index.json with your group's metadata:

[
  {
    "key": "my-group",
    "label": "My Research Group",
    "label_zh": "我的研究组",
    "file": "my-group.json"
  }
]

Field	Description
`key`	Unique identifier for your group (use lowercase and hyphens)
`label`	Display name in English
`label_zh`	Display name in Chinese (optional)
`file`	Filename of your paper list JSON (placed in the `groups/` directory)

3. Submit a Pull Request

Fork the repository, add your JSON file to the groups/ directory, update groups/index.json, and submit a Pull Request. Once merged, your group will appear in the dropdown for all users.

Auto-updated groups by author

You can also create groups that automatically stay up to date. Add "auto": true and an "authors" array to your entry in groups/index.json. The pipeline will match papers by author name and update the group file daily.

{
  "key": "xian-sun",
  "label": "Xian Sun",
  "label_zh": "孙显",
  "file": "xian-sun.json",
  "auto": true,
  "authors": ["Xian Sun"]
}

Multiple author name variants are supported (e.g., ["Gui-Song Xia", "Guisong Xia"]). No need to create the JSON file manually — it will be generated automatically.
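
The author matching behind auto-updated groups can be pictured with the sketch below. It is a simplified illustration and not the pipeline's actual code. It assumes the comma-separated `Authors` field from the Output Schema, compares names exactly after lowercasing and whitespace cleanup, and emits the same array of paper links used by manual groups.

```python
def normalize_name(name):
    """Lowercase and collapse whitespace so comparisons ignore trivial differences."""
    return " ".join(name.lower().split())


def matches_group(paper, author_variants):
    """True if any listed author equals one of the configured name variants."""
    wanted = {normalize_name(variant) for variant in author_variants}
    authors = {normalize_name(a) for a in paper.get("Authors", "").split(",")}
    return bool(wanted & authors)


def build_group(papers, author_variants):
    """Collect Paper_link values for matching papers, preserving input order."""
    return [p["Paper_link"] for p in papers if matches_group(p, author_variants)]
```

Exact matching is the reason name variants matter in this sketch: "Gui-Song Xia" and "Guisong Xia" are different strings, so both need to be listed to catch both spellings.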

Radar Page

The Radar page combines a 3D globe word cloud with weekly hot paper rankings to give you a visual overview of trending topics across all data sources.

Globe Word Cloud

Keywords from wordcloud/keywords.json are distributed on a Fibonacci sphere and rendered as a 3D rotating globe
Each word has a unique color from a 20-color palette; back-facing words are desaturated for a natural depth effect
Latitude and longitude grid lines are drawn faintly on the globe surface for visual polish
Drag to rotate manually; click any keyword to apply it as a search filter in the Papers page
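
For readers unfamiliar with the Fibonacci sphere, the sketch below shows the standard golden-angle construction for spreading points roughly evenly over a sphere. It is written in Python for illustration. The page's own rendering code is not shown here and may differ in details such as axis orientation.

```python
import math


def fibonacci_sphere(count, radius=1.0):
    """Return `count` roughly evenly spaced (x, y, z) points on a sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(count):
        # y runs from near +1 to near -1; the half-step offset avoids the poles.
        y = 1.0 - (2.0 * i + 1.0) / count
        ring = math.sqrt(1.0 - y * y)  # radius of the horizontal circle at height y
        theta = golden_angle * i       # each point is rotated by the golden angle
        points.append((
            radius * ring * math.cos(theta),
            radius * y,
            radius * ring * math.sin(theta),
        ))
    return points
```

Each keyword would then be placed at one of these points. Stepping the height evenly while rotating by the golden angle avoids the clustering near the poles that a plain latitude/longitude grid produces.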

Weekly Hot Paper Rankings

Papers from the current week are scored by BERT-weighted keyword matching and ranked in three modes:

Mode	Description
Trend	Score weighted by keyword recency (2025 × 2.5, 2024 × 2.0, 2023 × 1.2, 2022 × 0.8)
Frequency	Raw count of matched keywords across title and abstract
Comprehensive	Combined trend + frequency score for a balanced ranking
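
One possible reading of the three modes is sketched below. Several details are assumptions rather than documented behavior: each keyword is modeled as a dict with hypothetical `word`, `weight`, and `year` keys, the frequency score counts distinct matched keywords, years outside the table get a multiplier of 1.0, and the comprehensive score is taken as the plain sum of the other two. Only the year multipliers come from the table above.

```python
# Year multipliers from the Trend row of the table above.
YEAR_MULTIPLIER = {2025: 2.5, 2024: 2.0, 2023: 1.2, 2022: 0.8}


def matched_keywords(paper, keywords):
    """Return the keyword dicts whose word occurs in the title or abstract."""
    text = f"{paper.get('Title', '')} {paper.get('Abstract', '')}".lower()
    return [kw for kw in keywords if kw["word"].lower() in text]


def score(paper, keywords, mode="comprehensive"):
    """Score one paper in the chosen mode (assumed formulas, see lead-in)."""
    hits = matched_keywords(paper, keywords)
    frequency = float(len(hits))
    trend = sum(kw["weight"] * YEAR_MULTIPLIER.get(kw["year"], 1.0) for kw in hits)
    if mode == "frequency":
        return frequency
    if mode == "trend":
        return trend
    if mode == "comprehensive":
        return trend + frequency
    raise ValueError(f"unknown mode: {mode}")


def rank(papers, keywords, mode="comprehensive"):
    """Sort papers from highest to lowest score."""
    return sorted(papers, key=lambda p: score(p, keywords, mode), reverse=True)
```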

Source Switching

Use the source tabs (All Papers, VLM, UAV, Agent, SAR) to filter both the globe keywords and the weekly ranking to a specific topic area. Each source has its own keyword set and paper pool.

Submit Resource

You can contribute datasets, tools, or libraries to the Resources page. All resource data is stored as JSON files in the resources/ directory, making it easy to add or update via Pull Request.

Directory Structure

resources/
  index.json          # Category index
  scene-cls.json      # Scene classification datasets
  obj-det.json        # Object detection datasets
  segmentation.json   # Segmentation datasets
  change-det.json     # Change detection datasets
  tools.json          # Tools & libraries

1. Add to an existing category

Open the corresponding JSON file (e.g., resources/obj-det.json) and add a new entry with name, url, and desc:

{
  "name": "My Dataset",
  "url": "https://example.com",
  "desc": "A brief one-line description of the dataset"
}

2. Add a new category

Create a new JSON file in the resources/ directory, then register it in resources/index.json:

{
  "key": "vqa",
  "label": "Datasets — VQA",
  "label_zh": "数据集 — 视觉问答",
  "file": "vqa.json",
  "icon": "cube"
}

Available icons: cube, grid, layers, trending, wrench.

3. Submit a Pull Request

Fork the repository, make your changes in the resources/ directory, and submit a Pull Request. Once merged, your contribution will appear on the Resources page for all users.