Food Image Classification and Segmentation with Attention-based Multiple Instance Learning

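Summary

In recent years, demand for accurate food quantification has grown, driven by the needs of dietary monitoring applications.
At the same time, computer vision approaches have shown great potential for automating tasks in the food domain.
Traditionally, developing machine learning models for these problems has relied on training datasets with pixel-level class annotations.
However, this approach raises challenges in data collection and ground-truth generation, which must be carried out in multiple settings and for thousands of classes and therefore quickly become costly and error-prone.
To overcome these challenges, this paper presents a weakly supervised methodology for training food image classification and semantic segmentation models without relying on pixel-level annotations.
The proposed methodology is based on a multiple instance learning approach combined with an attention-based mechanism.
At test time, the models are used for classification while the attention mechanism simultaneously generates semantic heat maps that are used for food class segmentation.
To verify the feasibility of the proposed approach, the paper conducts experiments on two meta-classes in the FoodSeg103 dataset and explores the functioning properties of the attention mechanism.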

Summary (original)

The demand for accurate food quantification has increased in recent years, driven by the needs of applications in dietary monitoring. At the same time, computer vision approaches have exhibited great potential in automating tasks within the food domain. Traditionally, the development of machine learning models for these problems relies on training data sets with pixel-level class annotations. However, this approach introduces challenges arising from data collection and ground truth generation that quickly become costly and error-prone since they must be performed in multiple settings and for thousands of classes. To overcome these challenges, the paper presents a weakly supervised methodology for training food image classification and semantic segmentation models without relying on pixel-level annotations. The proposed methodology is based on a multiple instance learning approach in combination with an attention-based mechanism. At test time, the models are used for classification and, concurrently, the attention mechanism generates semantic heat maps which are used for food class segmentation. In the paper, we conduct experiments on two meta-classes within the FoodSeg103 data set to verify the feasibility of the proposed approach and we explore the functioning properties of the attention mechanism.
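
To make the idea of attention-based multiple instance learning more concrete, here is a minimal NumPy sketch of attention pooling over a bag of image patches. The abstract does not give the exact formulation, so everything here is an assumption for illustration: the tanh-based scoring function, the names `attention_mil_pool`, `H`, `V`, and `w`, the 4x4 patch grid, and the random stand-in features are all hypothetical and not taken from the paper. The sketch shows how one set of attention weights can serve both to pool patch features for image-level classification and to form a coarse heat map.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    x = x - np.max(x)
    e = np.exp(x)
    return e / e.sum()


def attention_mil_pool(H, V, w):
    """Pool a bag of instance embeddings with learned attention.

    H: (K, D) embeddings of K instances (e.g. image patches).
    V: (L, D) and w: (L,) are attention parameters (hypothetical names).
    Returns the bag embedding z of shape (D,) and the weights a of shape (K,).
    """
    scores = np.tanh(H @ V.T) @ w  # one scalar score per instance, shape (K,)
    a = softmax(scores)            # attention weights, non-negative, sum to 1
    z = a @ H                      # attention-weighted average of the instances
    return z, a


# Random stand-ins for patch features and parameters (illustration only).
rng = np.random.default_rng(0)
K, D, L = 16, 8, 4                 # 16 patches from an assumed 4x4 grid
H = rng.normal(size=(K, D))
V = rng.normal(size=(L, D))
w = rng.normal(size=L)

z, a = attention_mil_pool(H, V, w)

# Arranging the per-patch weights on the patch grid gives a coarse heat map.
heatmap = a.reshape(4, 4)
```

In a trained model, `z` would be passed to a classifier that is supervised only with image-level labels, while `heatmap` could be upsampled and thresholded to obtain a segmentation mask; how the paper performs those steps is not stated in the abstract.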

arXiv information

Authors	Valasia Vlachopoulou, Ioannis Sarafis, Alexandros Papadopoulos
Published	2023-08-22 13:59:47+00:00

Source and services used

arxiv.jp, Google
