Navigating the Post-API Dilemma | Search Engine Results Pages Present a Biased View of Social Media Data

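Summary

Recent decisions to shut down access to social media APIs are harming Internet research and the field of computational social science as a whole.
This loss of data access has been dubbed the Post-API era of Internet research.
Fortunately, popular search engines can crawl, capture, and surface social media data on their Search Engine Results Pages (SERP) when given a suitable search query, which may offer a way out of this dilemma.
In the present work we ask whether SERP provides a complete and unbiased sample of social media data, and whether SERP is a viable alternative to direct API access.
To answer these questions, we perform a comparative analysis between (Google) SERP results and nonsampled data from Reddit and Twitter/X.
We find that SERP results are highly biased in favor of popular posts and against political, pornographic, and vulgar posts, are more positive in sentiment, and have large topical gaps.
Overall, we conclude that SERP is not a viable alternative to social media API access.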

Abstract (original)

Recent decisions to discontinue access to social media APIs are having detrimental effects on Internet research and the field of computational social science as a whole. This lack of access to data has been dubbed the Post-API era of Internet research. Fortunately, popular search engines have the means to crawl, capture, and surface social media data on their Search Engine Results Pages (SERP) if provided the proper search query, and may provide a solution to this dilemma. In the present work we ask: does SERP provide a complete and unbiased sample of social media data? Is SERP a viable alternative to direct API-access? To answer these questions, we perform a comparative analysis between (Google) SERP results and nonsampled data from Reddit and Twitter/X. We find that SERP results are highly biased in favor of popular posts; against political, pornographic, and vulgar posts; are more positive in their sentiment; and have large topical gaps. Overall, we conclude that SERP is not a viable alternative to social media API access.

arXiv information

Authors	Amrit Poudel, Tim Weninger
Published	2024-04-02 15:28:07+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
