Monthly Archives: February 2024

MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark

Posted on February 8, 2024 by jarxiv

Abstract: Multimodal large language models (MLLMs) have recently attracted significant attention, and …

Categories: cs.AI, cs.CL, cs.CV

Mesh-based Gaussian Splatting for Real-time Large-scale Deformation

Posted on February 8, 2024 by jarxiv

Abstract: Neural distance fields and neural radiance fields …

Categories: cs.CV, cs.GR

Spiking-PhysFormer: Camera-Based Remote Photoplethysmography with Parallel Spike-driven Transformer

Posted on February 8, 2024 by jarxiv

Abstract: With artificial neural networks (ANNs), camera-based remote photoplethysmography …

Categories: cs.CV

iDeLog: Iterative Dual Spatial and Kinematic Extraction of Sigma-Lognormal Parameters

Posted on February 8, 2024 by jarxiv

Abstract: The kinematic theory of rapid movements and its associated Sigma-Lognormal model, in a variety of applications …

Categories: cs.CV

Domain Adaptation based Interpretable Image Emotion Recognition using Facial Expression Recognition

Posted on February 8, 2024 by jarxiv

Abstract: In this paper, facial and non-facial objects, and non-human components …

Categories: cs.CV, cs.LG

NeRF as Non-Distant Environment Emitter in Physics-based Inverse Rendering

Posted on February 8, 2024 by jarxiv

Abstract: Physics-based inverse rendering, from captured 2D images, shape, materi …

Categories: cs.CV, cs.GR

SARI: Simplistic Average and Robust Identification based Noisy Partial Label Learning

Posted on February 8, 2024 by jarxiv

Abstract: Partial label learning (PLL), with each training instance given a set of candidate la …

Categories: cs.CV, cs.LG

Data-efficient Large Vision Models through Sequential Autoregression

Posted on February 8, 2024 by jarxiv

Abstract: Avoiding language input and training general-purpose vision models purely on sequential visual data …

Categories: cs.CV

Fully Hyperbolic Convolutional Neural Networks for Computer Vision

Posted on February 8, 2024 by jarxiv

Abstract: Real-world visual data exhibits inherent hierarchical structures that can be effectively represented in hyperbolic space …

Categories: cs.CV, cs.LG

Mixed Autoencoder for Self-supervised Visual Representation Learning

Posted on February 8, 2024 by jarxiv

Abstract: Masked Autoencoder (MAE), with image patches randomly …

Categories: cs.CV, cs.LG

