"cs.AI" Category Archive

Prisma: An Open Source Toolkit for Mechanistic Interpretability in Vision and Video

Posted on: June 3, 2025 | Author: jarxiv

Summary: Robust tools and publicly available pre-trained models, for the mechanistic interpretability of language models …

Categories: cs.AI, cs.CV, cs.LG

A Survey on Event-driven 3D Reconstruction: Development under Different Categories

Posted on: June 3, 2025 | Author: jarxiv

Summary: Event cameras, with their high temporal resolution, low latency, and high dynamic range …

Categories: cs.AI, cs.CV, cs.GR

RePaViT: Scalable Vision Transformer Acceleration via Structural Reparameterization on Feedforward Network Layers

Posted on: June 3, 2025 | Author: jarxiv

Summary: Feedforward network (FFN) layers, rather than attention layers, Vi …

Categories: cs.AI, cs.CV

ARFlow: Human Action-Reaction Flow Matching with Physical Guidance

Posted on: June 3, 2025 | Author: jarxiv

Summary: A fundamental challenge in modeling causal human interactions, human action-reaction …

Categories: cs.AI, cs.CV

Jigsaw-R1: A Study of Rule-based Visual Reinforcement Learning with Jigsaw Puzzles

Posted on: June 3, 2025 | Author: jarxiv

Summary: Rule-based reinforcement learning (RL) for multimodal large language models (MLLMs) …

Categories: cs.AI, cs.CL, cs.CV

DIS-CO: Discovering Copyrighted Content in VLMs Training Data

Posted on: June 3, 2025 | Author: jarxiv

Summary: Without direct access to the training data, copyrighted content …

Categories: cs.AI, cs.CV, cs.LG, I.2

Improving Medical Large Vision-Language Models with Abnormal-Aware Feedback

Posted on: June 3, 2025 | Author: jarxiv

Summary: Existing medical large vision-language models (Med-LVLMs), with extensive medical knowledge …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

MSDNet: Multi-Scale Decoder for Few-Shot Semantic Segmentation via Transformer-Guided Prototyping

Posted on: June 3, 2025 | Author: jarxiv

Summary: In few-shot semantic segmentation, segmenting objects in a query image …

Categories: cs.AI, cs.CV

A Conformal Risk Control Framework for Granular Word Assessment and Uncertainty Calibration of CLIPScore Quality Estimates

Posted on: June 3, 2025 | Author: jarxiv

Summary: In this work, current limitations of learned image captioning evaluation metrics, in particular …

Categories: cs.AI, cs.CL, cs.CV

TextDestroyer: A Training- and Annotation-Free Diffusion Method for Destroying Anomal Text from Images

Posted on: June 3, 2025 | Author: jarxiv

Summary: In this paper, for scene text destruction using a pre-trained diffusion model …

Categories: cs.AI, cs.CV, cs.LG
