Vision Transformer Based Semantic Communications for Next Generation Wireless Networks

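Summary

In the evolving landscape of 6G networks, semantic communication is poised to transform data transmission by prioritizing the delivery of semantic meaning over the accuracy of the raw data.
This paper presents a Vision Transformer (ViT)-based semantic communication framework deliberately designed to achieve high semantic similarity during image transmission while minimizing bandwidth demand.
By using ViT as the encoder-decoder framework, the proposed architecture encodes images into high-semantic content at the transmitter and reconstructs them accurately at the receiver while accounting for real-world fading and noise.
Building on the attention mechanisms inherent to ViTs, the model outperforms convolutional neural networks (CNNs) and generative adversarial networks (GANs) tailored to generating such images.
The proposed ViT-based architecture achieves a peak signal-to-noise ratio (PSNR) of 38 dB, higher than that of other deep learning (DL) approaches, while maintaining semantic similarity across different communication environments.
These findings establish the ViT-based approach as a significant breakthrough in semantic communications.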
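For readers unfamiliar with the PSNR figure quoted above, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE), in plain Python. It is not the authors' evaluation code: the function names `psnr` and `mse_for_psnr`, the flat-list pixel representation, and the 8-bit peak value of 255 are all assumptions made for illustration.

```python
import math


def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumes flat sequences of pixel intensities and an 8-bit peak value by
    default; both are illustrative choices, not taken from the paper.
    """
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the transmitted and reconstructed pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_for_psnr(psnr_db, max_value=255.0):
    """Invert the PSNR formula: the mean squared error implied by a PSNR value."""
    return max_value ** 2 / (10.0 ** (psnr_db / 10.0))


if __name__ == "__main__":
    sent = [10, 20, 30, 40]
    received = [11, 19, 30, 42]
    print(f"PSNR: {psnr(sent, received):.2f} dB")
    print(f"MSE implied by 38 dB: {mse_for_psnr(38.0):.2f}")
```

Under the 8-bit assumption, a PSNR of 38 dB corresponds to a mean squared error of roughly 10.3, that is, a root-mean-square pixel error of about 3.2 intensity levels out of 255.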

Summary (original)

In the evolving landscape of 6G networks, semantic communications are poised to revolutionize data transmission by prioritizing the transmission of semantic meaning over raw data accuracy. This paper presents a Vision Transformer (ViT)-based semantic communication framework that has been deliberately designed to achieve high semantic similarity during image transmission while simultaneously minimizing the demand for bandwidth. By equipping ViT as the encoder-decoder framework, the proposed architecture can proficiently encode images into a high semantic content at the transmitter and precisely reconstruct the images, considering real-world fading and noise consideration at the receiver. Building on the attention mechanisms inherent to ViTs, our model outperforms Convolution Neural Network (CNNs) and Generative Adversarial Networks (GANs) tailored for generating such images. The architecture based on the proposed ViT network achieves the Peak Signal-to-noise Ratio (PSNR) of 38 dB, which is higher than other Deep Learning (DL) approaches in maintaining semantic similarity across different communication environments. These findings establish our ViT-based approach as a significant breakthrough in semantic communications.

arXiv information

Authors	Muhammad Ahmed Mohsin,Muhammad Jazib,Zeeshan Alam,Muhmmad Farhan Khan,Muhammad Saad,Muhammad Ali Jamshed
Published	2025-03-21 16:23:02+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
