Monthly Archive: August 2024

MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark

Posted on August 15, 2024 by jarxiv

Abstract: With the development of multimodal large language models (MLLMs), mathematical problems …

Categories: cs.CL, cs.CV

R2Human: Real-Time 3D Human Appearance Rendering from a Single Image

Posted on August 15, 2024 by jarxiv

Abstract: Real-time rendering of 3D human appearance from a single image …

Categories: cs.CV

GS-Pose: Generalizable Segmentation-based 6D Object Pose Estimation with 3D Gaussian Splatting

Posted on August 15, 2024 by jarxiv

Abstract: In this paper, to localize and estimate the 6D pose of novel objects …

Categories: cs.CV

Sonic: Fast and Transferable Data Poisoning on Clustering Algorithms

Posted on August 15, 2024 by jarxiv

Abstract: Data poisoning attacks on clustering algorithms have not received much attention …

Categories: cs.CR, cs.CV, cs.LG

Disentangled Representation Learning with Transmitted Information Bottleneck

Posted on August 15, 2024 by jarxiv

Abstract: Encoding only the task-relevant information from raw data, i.e., disentangled …

Categories: cs.CV, cs.LG

MetaSeg: MetaFormer-based Global Contexts-aware Network for Efficient Semantic Segmentation

Posted on August 15, 2024 by jarxiv

Abstract: Beyond the Transformer, the performance of the Transformer …

Categories: cs.AI, cs.CV

Transformers and Large Language Models for Efficient Intrusion Detection Systems: A Comprehensive Survey

Posted on August 15, 2024 by jarxiv

Abstract: With significant advances in Transformers LLMs, NLP …

Categories: cs.AI, cs.CL, cs.CR, cs.CV, eess.AS

InternVideo2: Scaling Foundation Models for Multimodal Video Understanding

Posted on August 15, 2024 by jarxiv

Abstract: State of the art in video recognition, video-text tasks, and video-centric dialogue …

Categories: cs.CV

DeepFace-Attention: Multimodal Face Biometrics for Attention Estimation with Application to e-Learning

Posted on August 15, 2024 by jarxiv

Abstract: In this work, using a set of facial analysis techniques applied to webcam video, …

Categories: cs.CV, cs.HC

Progressive Radiance Distillation for Inverse Rendering with Gaussian Splatting

Posted on August 15, 2024 by jarxiv

Abstract: Using a distillation progress map, we … physically based rendering and Gaussian-based …

Categories: cs.CV
