Archive of posts by "jarxiv"

GPTQv2: Efficient Finetuning-Free Quantization for Asymmetric Calibration

Posted: April 7, 2025 | Author: jarxiv

Summary: For compressing large-scale transformer architectures, GPTQv2 is a new finetuning-free … Read more →

Categories: cs.LG | Comments off

Ichigo: Mixed-Modal Early-Fusion Realtime Voice Assistant

Posted: April 7, 2025 | Author: jarxiv

Summary: Large language models (LLMs) have revolutionized natural language processing, but speech and text … Read more →

Categories: cs.CL, cs.SD, eess.AS | Comments off

Why do LLMs attend to the first token?

Posted: April 7, 2025 | Author: jarxiv

Summary: Large language models (LLMs) tend to attend heavily to the first token in a sequence … Read more →

Categories: cs.CL | Comments off

A Survey of Large Language Models in Mental Health Disorder Detection on Social Media

Posted: April 7, 2025 | Author: jarxiv

Summary: Detecting and intervening in mental health problems is a globally important research topic, and social … Read more →

Categories: cs.CL, I.2.7 | Comments off

RBT4DNN: Requirements-based Testing of Neural Networks

Posted: April 7, 2025 | Author: jarxiv

Summary: Testing deep neural networks (DNNs), where failures lead to serious consequences … Read more →

Categories: cs.AI, cs.LG, cs.SE | Comments off

Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation

Posted: April 7, 2025 | Author: jarxiv

Summary: Talking head synthesis, for virtual avatars and human-computer interaction … Read more →

Categories: cs.CV | Comments off

Retrieving Semantics from the Deep: an RAG Solution for Gesture Synthesis

Posted: April 7, 2025 | Author: jarxiv

Summary: Nonverbal communication, with semantically rich gestures that help convey the meaning of an utterance … Read more →

Categories: cs.CV | Comments off

Quattro: Transformer-Accelerated Iterative Linear Quadratic Regulator Framework for Fast Trajectory Optimization

Posted: April 7, 2025 | Author: jarxiv

Summary: Real-time optimal control is a fundamental challenge in robotics. Representative trajectory optimization … Read more →

Categories: cs.RO, cs.SY, eess.SY | Comments off

Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme

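Posted: April 7, 2025 | Author: jarxiv

Summary: In recent years, reinforcement learning (RL) has shown strong potential for improving the reasoning abilities of large language models … Read more →

Categories: cs.CL, cs.CV, cs.LG | Comments off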

Computing High-dimensional Confidence Sets for Arbitrary Distributions

Posted: April 4, 2025 | Author: jarxiv

Summary: We study the problem of learning a high-density region of an arbitrary distribution over $\mathbb{R}^d$ … Read more →

Categories: cs.DS, cs.LG, math.ST, stat.ML, stat.TH | Comments off
