Monthly Archives: September 2024

Squid: Long Context as a New Modality for Energy-Efficient On-Device Language Models

Posted on September 4, 2024 by jarxiv

Summary: In this paper, for the energy-efficient processing of long contexts in language models, …

Categories: cs.CL

A Fundamental Trade-off in Aligned Language Models and its Relation to Sampling Adaptors

Posted on September 4, 2024 by jarxiv

Summary: The quality of a string as judged by a human reader, and its probability under a language model, p( …

Categories: cs.CL

Improving Rare Word Translation With Dictionaries and Attention Masking

Posted on September 4, 2024 by jarxiv

Summary: In machine translation, rare words, especially in low-resource or out-of-domain translation settings, mainstream …

Categories: cs.CL, cs.LG

Mamba3D: Enhancing Local Features for 3D Point Cloud Analysis via State Space Model

Posted on September 4, 2024 by jarxiv

Summary: Existing Transformer-based models for point cloud analysis have quadratic …

Categories: cs.AI, cs.CV, cs.LG

An Effective Information Theoretic Framework for Channel Pruning

Posted on September 4, 2024 by jarxiv

Summary: Channel pruning, for accelerating and compressing convolutional neural networks, …

Categories: cs.AI, cs.IT, cs.LG, math.IT

Stabilizing Extreme Q-learning by Maclaurin Expansion

Posted on September 4, 2024 by jarxiv

Summary: In offline reinforcement learning, due to evaluating out-of-distribution actions from the dataset, …

Categories: cs.AI, cs.LG

The Cultivated Practices of Text-to-Image Generation

Posted on September 4, 2024 by jarxiv

Summary: Humanity, with generative artificial intelligence (AI) that lets anyone synthesize digital information, a novel …

Categories: cs.AI, cs.CY, I.2.0

Sentiment Analysis Across Languages: Evaluation Before and After Machine Translation to English

Posted on September 4, 2024 by jarxiv

Summary: People communicate in more than 7,000 languages worldwide, and India alone …

Categories: cs.AI, cs.CL

Domain-Specific Improvement on Psychotherapy Chatbot Using Assistant

Posted on September 4, 2024 by jarxiv

Summary: Large language models (LLMs), on specific tasks using human-written instruction data, …

Categories: cs.AI, cs.CL

An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs

Posted on September 4, 2024 by jarxiv

Summary: Large language models (LLMs), Chain-of-Thought (CoT) …

Categories: cs.AI, I.2.7
