arxiv_cs_ai April 20, 2026

A PennyLane-Centric Dataset to Enhance LLM-based Quantum Code Generation using RAG

Translated: 2026/4/20 11:17:18

quantum-computing, large-language-models, retrieval-augmented-generation, code-generation, pennylane

Original Content

arXiv:2503.02497v4. Announce Type: replace-cross.

Abstract: Large Language Models (LLMs) offer powerful capabilities in code generation, natural language understanding, and domain-specific reasoning. Their application to quantum software development remains limited, in part because of the lack of high-quality datasets both for LLM training and as dependable knowledge sources. To bridge this gap, we introduce PennyLang, an off-the-shelf, high-quality dataset of 3,347 PennyLane-specific quantum code samples with contextual descriptions, curated from textbooks, official documentation, and open-source repositories. Our contributions are threefold: (1) the creation and open-source release of PennyLang, a purpose-built dataset for quantum programming with PennyLane; (2) a framework for automated quantum code dataset construction that systematizes curation, annotation, and formatting to maximize downstream LLM usability; and (3) a baseline evaluation of the dataset across multiple open-source and commercial models, including ablation studies, all conducted within a retrieval-augmented generation (RAG) pipeline. Using PennyLang with RAG substantially improves performance: for example, Qwen 7B's success rate rises from 8.7% without retrieval to 41.7% with full-context augmentation, and LLaMa 4 improves from 78.8% to 84.8%, while also reducing hallucinations and enhancing quantum code correctness. Moving beyond Qiskit-focused studies, we bring LLM-based tools and reproducible methods to PennyLane for advancing AI-assisted quantum development.
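To make the retrieval-augmented setup concrete, here is a minimal Python sketch of the general idea: score stored code samples against a task description, then prepend the best matches to the prompt sent to the model. It is not the authors' pipeline. The three samples, the bag-of-words cosine scoring, and the names `SAMPLES`, `retrieve`, and `build_prompt` are all assumptions made for illustration; a real system would more likely use learned embeddings and the actual PennyLang entries.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for dataset entries (description + code).
# These are illustrative only and are NOT taken from PennyLang.
SAMPLES = [
    {
        "description": "Prepare a Bell state on two qubits and measure PauliZ expectation",
        "code": "qml.Hadamard(wires=0)\nqml.CNOT(wires=[0, 1])\nreturn qml.expval(qml.PauliZ(0))",
    },
    {
        "description": "Apply a parameterized RX rotation on a single qubit",
        "code": "qml.RX(theta, wires=0)\nreturn qml.expval(qml.PauliZ(0))",
    },
    {
        "description": "Create a default.qubit device with three wires",
        "code": "dev = qml.device('default.qubit', wires=3)",
    },
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, samples, k=1):
    """Return the k samples whose descriptions best match the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        samples,
        key=lambda s: cosine(query_vec, Counter(tokenize(s["description"]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, samples, k=1):
    """Assemble a prompt: retrieved examples first, then the task."""
    context = "\n\n".join(
        f"# {s['description']}\n{s['code']}" for s in retrieve(query, samples, k)
    )
    return f"Reference PennyLane examples:\n{context}\n\nTask: {query}\n"


if __name__ == "__main__":
    print(build_prompt("Prepare a Bell state on two qubits", SAMPLES, k=1))
```

In this sketch the "without retrieval" baseline corresponds to sending only the `Task:` line, while the augmented setting sends the whole string returned by `build_prompt`. The code strings are never executed here, so PennyLane does not need to be installed to run the example.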