The Impact of Inference Acceleration Strategies on Bias of LLMs

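Summary

Over the last few years, the capabilities of large language models (LLMs) have advanced at an unprecedented pace.
These advances promise substantial benefits across a vast array of application domains.
However, because LLMs are so large, running inference with them is both costly and slow.
Consequently, much recent work has proposed strategies for improving inference efficiency, such as quantization, pruning, and caching.
These acceleration strategies reduce inference cost and latency, often by a factor of several, while maintaining much of the predictive performance measured on common benchmarks.
In this work, we explore another critical aspect of LLM performance: demographic bias in model generations caused by inference acceleration optimizations.
Using a wide range of metrics, we probe bias in model outputs from a number of angles.
Analysis of outputs before and after inference acceleration shows a significant change in bias.
Worryingly, these bias effects are complex and unpredictable.
A given combination of acceleration strategy and bias type may show little change in bias in one model but a large effect in another.
Our results highlight the need for in-depth, case-by-case evaluation of a model's bias after the model has been modified to accelerate inference.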

Abstract (original)

Last few years have seen unprecedented advances in capabilities of Large Language Models (LLMs). These advancements promise to deeply benefit a vast array of application domains. However, due to their immense size, performing inference with LLMs is both costly and slow. Consequently, a plethora of recent work has proposed strategies to enhance inference efficiency, e.g., quantization, pruning, and caching. These acceleration strategies reduce the inference cost and latency, often by several factors, while maintaining much of the predictive performance measured via common benchmarks. In this work, we explore another critical aspect of LLM performance: demographic bias in model generations due to inference acceleration optimizations. Using a wide range of metrics, we probe bias in model outputs from a number of angles. Analysis of outputs before and after inference acceleration shows significant change in bias. Worryingly, these bias effects are complex and unpredictable. A combination of an acceleration strategy and bias type may show little bias change in one model but may lead to a large effect in another. Our results highlight a need for in-depth and case-by-case evaluation of model bias after it has been modified to accelerate inference.

arXiv information

Authors	Elisabeth Kirsten, Ivan Habernal, Vedant Nanda, Muhammad Bilal Zafar
Published	2024-10-29 15:19:13+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
