Monthly Archives: March 2025

GSPR: Multimodal Place Recognition Using 3D Gaussian Splatting for Autonomous Driving

Posted on: March 7, 2025 | Author: jarxiv

Summary: Place recognition, for autonomous vehicles' localization results in GPS-denied environments …

Categories: cs.CV

ViT-VS: On the Applicability of Pretrained Vision Transformer Features for Generalizable Visual Servoing

Posted on: March 7, 2025 | Author: jarxiv

Summary: Through visual servoing, a robot, relative to a target object, end-eff …

Categories: cs.CV, cs.RO

X-Boundary: Establishing Exact Safety Boundary to Shield LLMs from Multi-Turn Jailbreaks without Compromising Usability

Posted on: March 7, 2025 | Author: jarxiv

Summary: Despite the rapid development of safety alignment techniques for LLMs, multi-turn …

Categories: cs.AI, cs.CL, cs.CR, cs.CV, cs.LG

Omnidirectional Multi-Object Tracking

Posted on: March 7, 2025 | Author: jarxiv

Summary: Panoramic images with a 360° field of view, of surrounding objects …

Categories: cs.CV, cs.RO, eess.IV

A Benchmark for Multi-Lingual Vision-Language Learning in Remote Sensing Image Captioning

Posted on: March 7, 2025 | Author: jarxiv

Summary: Remote sensing image captioning (RSIC) is a cross-modal field …

Categories: cs.CV

Enhancing Multimodal Medical Image Classification using Cross-Graph Modal Contrastive Learning

Posted on: March 7, 2025 | Author: jarxiv

Summary: Medical image classification is a crucial aspect of disease diagnosis, and often, deep learning …

Categories: cs.CV, eess.IV

The Best of Both Worlds: Integrating Language Models and Diffusion Models for Video Generation

Posted on: March 7, 2025 | Author: jarxiv

Summary: Recent advances in text-to-video (T2V) generation, autoregressive language models and …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Self-supervised pre-training with diffusion model for few-shot landmark detection in x-ray images

Posted on: March 7, 2025 | Author: jarxiv

Summary: Deep neural networks, in image classification, segmentation, landmark …

Categories: cs.AI, cs.CV

Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning

Posted on: March 7, 2025 | Author: jarxiv

Summary: Controllable generation through Stable Diffusion (SD) fine-tuning, fidelity, safety, and …

Categories: cs.AI, cs.CV, cs.HC, cs.LG

LLM-wrapper: Black-Box Semantic-Aware Adaptation of Vision-Language Models for Referring Expression Comprehension

Posted on: March 7, 2025 | Author: jarxiv

Summary: Vision-language models (VLMs), on a variety of open-vocabulary tasks …

Categories: cs.CV
