Monthly Archives: February 2024

ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning

Posted on February 20, 2024 by jarxiv

Summary: Recently, many versatile multimodal large language models (MLLMs) have …

Categories: cs.CV

Adversarial Feature Alignment: Balancing Robustness and Accuracy in Deep Learning via Adversarial Training

Posted on February 20, 2024 by jarxiv

Summary: Deep learning models continue to improve in accuracy, but against adversarial attacks they still …

Categories: cs.CR, cs.CV, cs.LG, D.2.7

Pan-Mamba: Effective pan-sharpening with State Space Model

Posted on February 20, 2024 by jarxiv

Summary: In pan-sharpening, low-resolution multispectral images and high-resolution panchro …

Categories: cs.CV

Zero shot VLMs for hate meme detection: Are we there yet?

Posted on February 20, 2024 by jarxiv

Summary: Multimedia content on social media is evolving rapidly, and …

Categories: cs.CL, cs.CV, cs.LG

LaneGraph2Seq: Lane Topology Extraction with Language Model via Vertex-Edge Encoding and Connectivity Enhancement

Posted on February 20, 2024 by jarxiv

Summary: Understanding road structure is important for autonomous driving. Complex road structures are often directed …

Categories: cs.CV

Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability

Posted on February 20, 2024 by jarxiv

Summary: Auto-regressive models, by modeling the joint distribution in grid space, 2D …

Categories: cs.CV

AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling

Posted on February 20, 2024 by jarxiv

Summary: AnyGPT, for various modalities such as speech, text, images, and music, …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Task-Specific Normalization for Continual Learning of Blind Image Quality Models

Posted on February 20, 2024 by jarxiv

Summary: In this paper, quality prediction accuracy, the plasticity-stability trade-off, and task order …

Categories: cs.CV

Mixed Gaussian Flow for Diverse Trajectory Prediction

Posted on February 20, 2024 by jarxiv

Summary: Existing trajectory prediction studies make intensive use of generative models. Normalizing flows …

Categories: cs.CV

Open3DSG: Open-Vocabulary 3D Scene Graphs from Point Clouds with Queryable Objects and Open-Set Relationships

Posted on February 20, 2024 by jarxiv

Summary: Current approaches to 3D scene graph prediction rely on labeled datasets …

Categories: cs.CV
