Monthly Archive: June 2023

Grounding Language Models to Images for Multimodal Inputs and Outputs

Posted on June 2, 2023 by jarxiv

Abstract: We ground pretrained text-only language models to the visual domain with an efficient … Continue reading →

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Spatio-Angular Convolutions for Super-resolution in Diffusion MRI

Posted on June 2, 2023 by jarxiv

Abstract: Diffusion MRI (dMRI) is a widely used medical imaging modality, but … Continue reading →

Categories: cs.CV, eess.IV

A deep-learning approach to early identification of suggested sexual harassment from videos

Posted on June 2, 2023 by jarxiv

Abstract: Sexual harassment, sexual abuse, and sexual violence are prevalent problems in this day and age. Women's … Continue reading →

Categories: cs.CV, cs.LG

Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models

Posted on June 2, 2023 by jarxiv

Abstract: Vision and language (VL) models, for aligning the representation spaces of images and text, … Continue reading →

Categories: cs.CV

Continual Vision-Language Representation Learning with Off-Diagonal Information

Posted on June 2, 2023 by jarxiv

Abstract: Large-scale multimodal contrastive learning frameworks such as CLIP typically … Continue reading →

Categories: cs.CV, cs.LG

DeepFake-Adapter: Dual-Level Adapter for DeepFake Detection

Posted on June 2, 2023 by jarxiv

Abstract: On unseen or degraded samples, existing deepfake detection methods … Continue reading →

Categories: cs.CV

A Transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics

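Posted on June 2, 2023 by jarxiv

Abstract: During the diagnostic process, clinicians draw on combined information such as chief complaints, medical images, and laboratory test results … Continue reading →

Categories: cs.CL, cs.CV, cs.LG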

Quantifying Deep Learning Model Uncertainty in Conformal Prediction

Posted on June 2, 2023 by jarxiv

Abstract: Accurate estimation of predictive uncertainty in deep neural networks, especially … Continue reading →

Categories: cs.AI, cs.CV, cs.LG

VOCALExplore: Pay-as-You-Go Video Data Exploration and Model Building [Technical Report]

Posted on June 2, 2023 by jarxiv

Abstract: So that users can build domain-specific models over video datasets, … Continue reading →

Categories: cs.CV, cs.DB, cs.SD, eess.AS

LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day

Posted on June 2, 2023 by jarxiv

Abstract: Conversational generative AI has shown remarkable promise for empowering biomedical practitioners … Continue reading →

Categories: cs.CL, cs.CV
