Monthly Archive: August 2024

Cross Psuedo Supervision Framework for Sparsely Labelled Geo-spatial Images

Posted: August 6, 2024 | Author: jarxiv

Abstract: Land use land cover (LULC) mapping is essential for urban and resource planning …

Categories: cs.CV

Open Sesame! Universal Black Box Jailbreaking of Large Language Models

Posted: August 6, 2024 | Author: jarxiv

Abstract: Large language models (LLMs), designed to provide helpful and safe responses, …

Categories: cs.CL, cs.CV, cs.NE

MaFreeI2P: A Matching-Free Image-to-Point Cloud Registration Paradigm with Active Camera Pose Retrieval

Posted: August 6, 2024 | Author: jarxiv

Abstract: Image-to-point cloud registration seeks to estimate the relative camera pose, …

Categories: cs.CV, eess.IV

CMR-Agent: Learning a Cross-Modal Agent for Iterative Image-to-Point Cloud Registration

Posted: August 6, 2024 | Author: jarxiv

Abstract: Image-to-point cloud registration determines the relative camera pose of an RGB image with respect to a point cloud …

Categories: cs.CV, cs.RO

Tensorial template matching for fast cross-correlation with rotations and its application for tomography

Posted: August 6, 2024 | Author: jarxiv

Abstract: Object detection is a core task in computer vision. …

Categories: cs.CV, I.4.9, q-bio.QM

Multi-weather Cross-view Geo-localization Using Denoising Diffusion Models

Posted: August 6, 2024 | Author: jarxiv

Abstract: Cross-view geo-localization in GNSS-denied environments, from drone …

Categories: cs.AI, cs.CV

Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model

Posted: August 6, 2024 | Author: jarxiv

Abstract: Mixture-of-Experts (MoE) … Large Visi …

Categories: cs.CV

Zero shot VLMs for hate meme detection: Are we there yet?

Posted: August 6, 2024 | Author: jarxiv

Abstract: Multimedia content on social media is evolving rapidly, …

Categories: cs.CL, cs.CV, cs.LG

FE-Adapter: Adapting Image-based Emotion Classifiers to Videos

Posted: August 6, 2024 | Author: jarxiv

Abstract: Using large pre-trained models for specific tasks has yielded impressive …

Categories: cs.CV

Revolutionizing Urban Safety Perception Assessments: Integrating Multimodal Large Language Models with Street View Images

Posted: August 6, 2024 | Author: jarxiv

Abstract: Measuring urban safety perception is an important and complex task, and traditionally, human resources …

Categories: cs.CV
