Archive of posts by "jarxiv"

Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO

Posted: June 11, 2025 | Author: jarxiv

Abstract: Recent advances have raised the chain-of-thought (CoT) reasoning capabilities of large language models (LLMs) …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

HiSin: Efficient High-Resolution Sinogram Inpainting via Resolution-Guided Progressive Inference

Posted: June 11, 2025 | Author: jarxiv

Abstract: In high-resolution sinogram inpainting, missing high-frequency projections lead to visible artifacts …

Categories: cs.CV, eess.IV

Video-CoT: A Comprehensive Dataset for Spatiotemporal Understanding of Videos Based on Chain-of-Thought

Posted: June 11, 2025 | Author: jarxiv

Abstract: From video analysis to interactive systems, understanding video content …

Categories: cs.CV

CulturalFrames: Assessing Cultural Expectation Alignment in Text-to-Image Models and Evaluation Metrics

Posted: June 11, 2025 | Author: jarxiv

Abstract: Text-to-image (T2I) models, as tools for generating visual content, …

Categories: cs.AI, cs.CL, cs.CV

TinyLLaVA-Video: Towards Smaller LMMs for Video Understanding with Group Resampler

Posted: June 11, 2025 | Author: jarxiv

Abstract: Video action recognition and scene understanding are fundamental to multimodal intelligence …

Categories: cs.CV

Adapting Vision-Language Foundation Model for Next Generation Medical Ultrasound Image Analysis

Posted: June 11, 2025 | Author: jarxiv

Abstract: Medical ultrasonography, used to examine superficial organs and tissues such as lymph nodes, the breast, and the thyroid, …

Categories: cs.CV

StereoVAE: A lightweight stereo-matching system using embedded GPUs

Posted: June 11, 2025 | Author: jarxiv

Abstract: We present a lightweight system for stereo matching on embedded GPUs. …

Categories: cs.AI, cs.CV, cs.MM, cs.RO

Mitigating Prior Shape Bias in Point Clouds via Differentiable Center Learning

Posted: June 11, 2025 | Author: jarxiv

Abstract: Masked autoencoding and generative pretraining, in computer vision and natural language …

Categories: cs.CV

Spatial Transcriptomics Expression Prediction from Histopathology Based on Cross-Modal Mask Reconstruction and Contrastive Learning

Posted: June 11, 2025 | Author: jarxiv

Abstract: Spatial transcriptomics captures gene expression levels at different spatial locations …

Categories: cs.AI, cs.CV

StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams

Posted: June 11, 2025 | Author: jarxiv

Abstract: Real-time reconstruction of dynamic 3D scenes from uncalibrated video streams …

Categories: cs.CV, cs.LG
