Archive for author "jarxiv"

Rad4XCNN: a new agnostic method for post-hoc global explanation of CNN-derived features by means of radiomics

Posted on: January 9, 2025 | Author: jarxiv

Abstract: In recent years, machine learning-based clinical decision support systems (CDSS) have, in several …

Categories: cs.AI, cs.CV, cs.LG

Motion-Zero: Zero-Shot Moving Object Control Framework for Diffusion-Based Video Generation

Posted on: January 9, 2025 | Author: jarxiv

Abstract: Recent large-scale pre-trained diffusion models, from detailed text descriptions, high …

Categories: cs.CV

OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis

Posted on: January 9, 2025 | Author: jarxiv

Abstract: Recent advances in omnimodal learning, though mainly within proprietary models, images, …

Categories: cs.CL, cs.CV

Learnable Scaled Gradient Descent for Guaranteed Robust Tensor PCA

Posted on: January 9, 2025 | Author: jarxiv

Abstract: Robust tensor principal component analysis (RTPCA), from multidimensional data, low-rank …

Categories: cs.CV

Supervision-free Vision-Language Alignment

Posted on: January 9, 2025 | Author: jarxiv

Abstract: Vision-language models (VLMs), in integrating visual and linguistic information, remarkable …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

PointDreamer: Zero-shot 3D Textured Mesh Reconstruction from Colored Point Cloud

Posted on: January 9, 2025 | Author: jarxiv

Abstract: Reconstructing a textured mesh from a colored point cloud is an important but challenging task …

Categories: cs.CV

Unified Coding for Both Human Perception and Generalized Machine Analytics with CLIP Supervision

Posted on: January 9, 2025 | Author: jarxiv

Abstract: A decoded bitstream typically serves only human or machine needs …

Categories: cs.CV, cs.MM

Towards Revisiting Visual Place Recognition for Joining Submaps in Multimap SLAM

Posted on: January 9, 2025 | Author: jarxiv

Abstract: Visual SLAM is a key technology for many autonomous systems …

Categories: cs.CV, cs.RO

Boosting Salient Object Detection with Knowledge Distillated from Large Foundation Models

Posted on: January 9, 2025 | Author: jarxiv

Abstract: Salient Object Detection (SOD), within a scene …

Categories: cs.CV

Identity-Preserving Video Dubbing Using Motion Warping

Posted on: January 9, 2025 | Author: jarxiv

Abstract: Video dubbing, from a reference video and a driving audio signal, realistic lip-sync …

Categories: cs.CV
