arxiv_cs_lg February 10, 2026

Understanding Large Language Models in Your Pockets: Performance Study on COTS Mobile Devices

Translated: 2026/3/15 9:03:02

large-language-models mobile-ai machine-learning cots-devices llm-performance

Original Content

arXiv:2410.03613v5 Announce Type: replace

Abstract: As large language models (LLMs) are increasingly integrated into every aspect of our work and daily lives, concerns about user privacy are growing, pushing the trend toward local deployment of these models. A number of lightweight LLMs (e.g., Gemini Nano, LLAMA2 7B) can run locally on smartphones, giving users greater control over their personal data. Because on-device LLMs are a rapidly emerging application, we are concerned about their performance on commercial-off-the-shelf (COTS) mobile devices. To fully understand the current landscape of LLM deployment on mobile platforms, we conduct a comprehensive measurement study on mobile devices. While user experience is the primary concern for end-users, developers focus more on the underlying implementations. Therefore, we evaluate both user-centric metrics (such as token throughput, latency, and response quality) and developer-critical factors, including resource utilization, OS strategies, battery consumption, and launch time. We also provide comprehensive comparisons across mobile system-on-chips (SoCs) from major vendors, highlighting their performance differences in handling LLM workloads, which may help developers identify and address bottlenecks in mobile LLM applications. We hope that this study can provide insights for both the development of on-device LLMs and the design of future mobile system architectures.