dev_to, April 20, 2026

ゲームの QA ベストプラクティス:開発者のテストガイド

Game QA Best Practices: A Developer's Testing Guide

Translated: 2026/4/20 12:01:52
game-quality-assurance, software-testing, performance-testing, regression-testing, game-dev-practices

Japanese Translation

ゲーム開発における品質保証(QA)は、プロジェクトの最後に付け足すフェーズではありません。プロトタイプからライブ運用に至るまで、すべてのスプリント、すべてのフィーチャーブランチ、すべてのマイルストーンに並走する実践です。QA を後回しにすれば、リリース前の最後の数週間を、本来なら数ヶ月前に検出できたはずのクラッシュ、リグレッション(回帰バグ)、パフォーマンス問題の火消しに費やすことになります。

Ocean View Games は、厳しい QA 要件を持つプロジェクトを数多くリリースしてきました。RuneScape Mobile は、20 年分のコンテンツにふさわしいフレームレートを維持しながら、数千種類の Android デバイス構成で一貫して動作する必要がありました。Domi Online は、FishNet によるサーバー権威型アーキテクチャで 1,000 人以上の同時プレイヤーを支える MMO であり、実際のネットワーク環境下でのみ現れるデシンク(同期ずれ)問題、競合状態、エッジケースを発見するために、厳格なマルチプレイヤーテストが必要でした。この二つのプロジェクトから得た教訓は、良い QA とは個人の奮闘ではなく、体系的な仕組みだということです。

本ガイドでは、我々が日々使っているテストプラクティスを説明します。抽象的な理論ではなく実用的なフレームワークを求める開発者やスタジオリーダーを対象としています。

ゲームの QA は複数の分野にまたがります。それぞれが異なるリスクカテゴリーを対象としており、どれかを省けば、その隙間はプレイヤーに見つけられます。

これが基盤です。各機能は意図通りに動作しているか?機能テストは、コアなゲームプレイループ、UI インタラクション、セーブ/ロードシステム、IAP フロー、その他あらゆるユーザー向けの挙動をカバーします。ハッピーパスだけでなくエッジケースも検証するテストケースが必要です:プレイヤーが購入ボタンを連打したら、取引の途中で切断されたら、ロード画面中にデバイスを回転させたら、それぞれ何が起きるか。

すべてのバグ修正と新機能には、他の何かを壊す可能性があります。回帰テスト(リグレッションテスト)は、コード変更後も既存の機能が動作していることを、過去のテストケースを再実行して確認します。ここは自動化が投資に見合う領域です。人間が実行すれば 2 日かかる回帰スイートも、自動化すれば数分で完了します。

フレームレート、メモリ消費量、ロード時間、バッテリー消費、熱的挙動は、すべてパフォーマンステストの範疇です。このカテゴリーは、厳しい熱制約とメモリ制約を持つハードウェア上で動作するモバイルゲームにおいて特に重要です。フラッグシップ端末では美しく動くゲームでも、目標オーディエンスの 40% を占めるミドルレンジ端末ではカクつく可能性があります。

ゲームは、幅広いデバイス、OS バージョン、画面サイズ、アスペクト比で動作する必要があります。互換性テストはデバイス固有の問題を特定します:Adreno GPU で失敗するシェーダー、ウルトラワイド画面で重なる UI 要素、特定の Samsung モデルでのオーディオドライバの不具合などです。

オンラインゲームでは、現実的なネットワーク条件下でのテストが必要です。これは、遅延、パケットロス、切断をシミュレートすることを意味します。また、予想ピークと同等かそれ以上の同時プレイヤー数でのストレステストも意味します。Domi Online では、サーバーが 500 以上の同時接続を処理しているときにのみ現れるデシンクバグをいくつか発見しました。これはローカル開発環境では決して発生しない条件です。

すべてのデバイスでテストすることはできません。しかし、プレイヤーの大部分をカバーするよう慎重に選んだサブセットでテストすることは可能です。目標は、カバレッジの広さと現実的な制約のバランスを取ったデバイスマトリクスを構築することです。

まずは市場シェアデータから始めます。Google Play コンソールと App Annie(現在は data.ai)は、目標地域で最も一般的なデバイス、チップセット、OS バージョンの内訳を提供します。そのデータから、3 つのパフォーマンスティアにまたがるデバイスを選択します:

Low: 2〜3GB RAM と旧世代の Mali / Adreno GPU を搭載した低価格デバイス。最小スペックを代表し、最も多くのパフォーマンス問題を表面化させます。
Mid: 4〜6GB RAM と現行世代のミドルレンジチップセットを搭載したデバイス。オーディエンスの大半がここに位置します。
High: フラッグシップデバイス。ゲームが高性能ハードウェアを活用できること、高フレームレートや高解像度でも機能が壊れないことを確認します。

主要メーカーごとに少なくとも 1 台のデバイスを含めてください。

Original Content

Quality assurance in game development is not a phase you bolt on at the end. It is a discipline that runs alongside every sprint, every feature branch, and every milestone from prototype through to live operations. Treat QA as an afterthought and you will spend your final weeks before launch firefighting crashes, regressions, and performance issues that should have been caught months earlier.

At Ocean View Games, we have shipped projects with demanding QA requirements. RuneScape Mobile needed to run consistently across thousands of Android device configurations while maintaining a frame rate that did justice to a game with two decades of content. Domi Online, an MMO with 1,000+ concurrent players on a FishNet server-authoritative architecture, required rigorous multiplayer testing to catch desync issues, race conditions, and edge cases that only surface under real network conditions. Both projects taught us that good QA is systematic, not heroic.

This guide covers the testing practices we use day to day. It is aimed at developers and studio leads who want a practical framework rather than abstract theory.

Game QA spans several disciplines. Each one targets a different category of risk, and skipping any of them leaves gaps that players will find for you.

The foundation: does each feature work as intended? Functional testing covers core gameplay loops, UI interactions, save/load systems, IAP flows, and every user-facing behaviour. You need test cases that exercise both the happy path and the edge cases: what happens when a player spams the buy button, disconnects mid-transaction, or rotates the device during a loading screen?

Every bug fix and new feature has the potential to break something else. Regression testing re-runs previous test cases to confirm that existing functionality still works after a code change. This is where automation pays for itself. A regression suite that takes a human two days to execute can run in minutes as an automated pass.
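Edge cases like these are a natural fit for automated regression checks. Below is a minimal pytest-style sketch, in Python rather than engine code, since the pattern is language-agnostic. The `PurchaseService` class and its `buy` method are hypothetical stand-ins for a real IAP flow; the point is that spamming the buy button must charge exactly once:

```python
# Hypothetical in-memory purchase service, used only to illustrate an
# edge-case regression test: spamming "buy" must not double-charge.
class PurchaseService:
    def __init__(self, balance):
        self.balance = balance
        self.completed = set()          # transaction ids already applied

    def buy(self, tx_id, price):
        """Apply a purchase exactly once per transaction id."""
        if tx_id in self.completed:     # duplicate tap: ignore silently
            return False
        if self.balance < price:
            raise ValueError("insufficient funds")
        self.balance -= price
        self.completed.add(tx_id)
        return True

def test_spamming_buy_charges_once():
    svc = PurchaseService(balance=100)
    # The player taps "buy" five times before the UI locks.
    results = [svc.buy("tx-1", price=30) for _ in range(5)]
    assert results == [True, False, False, False, False]
    assert svc.balance == 70            # charged exactly once
```

The same shape works for the disconnect-mid-transaction case: the test would drop the connection between the charge and the grant, then assert the transaction is either fully applied or fully rolled back.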
Frame rate, memory consumption, load times, battery drain, and thermal behaviour all fall under performance testing. This category is especially critical for mobile games, where you are running on hardware with strict thermal and memory limits. A game that runs beautifully on a flagship phone may stutter on a mid-range device that represents 40% of your target audience.

Your game needs to work across a range of devices, OS versions, screen sizes, and aspect ratios. Compatibility testing identifies device-specific issues: a shader that fails on Adreno GPUs, a UI element that overlaps on ultra-wide screens, or an audio driver bug on a specific Samsung model.

For online games, you need to test under realistic network conditions. This means simulating latency, packet loss, and disconnections. It also means stress testing with concurrent players at or above your expected peak. In Domi Online, we caught several desync bugs that only appeared when the server was handling 500+ simultaneous connections, conditions that never occur in a local dev environment.

You cannot test on every device. You can, however, test on a carefully chosen subset that covers the vast majority of your players. The goal is to build a device matrix that balances breadth of coverage against practical constraints.

Start with market share data. Google Play Console and App Annie (now data.ai) provide breakdowns of the most common devices, chipsets, and OS versions in your target regions. From that data, select devices across three performance tiers:

Low: Budget devices with 2-3 GB RAM and older Mali or Adreno GPUs. These represent your minimum spec and will surface the most performance issues.
Mid: Devices in the 4-6 GB RAM range with current-generation mid-range chipsets. This is where the bulk of your audience sits.
High: Flagship devices. These confirm that your game takes advantage of better hardware and that no features break at high frame rates or resolutions.
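To make the tier selection concrete, here is a small Python sketch of picking a matrix from market-share data. The device names and share figures are invented for illustration, and the greedy highest-share-per-tier pick is just one reasonable policy:

```python
# Invented market-share data: (device, tier, share of player base).
MARKET = [
    ("Galaxy A14",  "low",  0.11),
    ("Redmi 12",    "low",  0.09),
    ("Galaxy A54",  "mid",  0.14),
    ("Pixel 7a",    "mid",  0.10),
    ("Galaxy S24",  "high", 0.06),
]

def pick_matrix(devices, per_tier):
    """Choose the highest-share devices within each requested tier."""
    matrix = []
    for tier, count in per_tier.items():
        in_tier = sorted((d for d in devices if d[1] == tier),
                         key=lambda d: -d[2])
        matrix.extend(in_tier[:count])
    return matrix

matrix = pick_matrix(MARKET, {"low": 2, "mid": 2, "high": 1})
coverage = sum(share for _, _, share in matrix)
print(f"{len(matrix)} devices cover {coverage:.0%} of players")
# → "5 devices cover 50% of players"
```

In practice you would weight the pick by GPU family and manufacturer as well as raw share, which is exactly what the next paragraph recommends.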
Include at least one device from each major manufacturer (Samsung, Xiaomi, Google Pixel, Huawei) and cover both ARM Mali and Qualcomm Adreno GPU families. For iOS, test on the oldest supported iPhone model, a current mid-range (iPhone SE), and the latest flagship.

Physical devices give the most accurate results, but cloud device farms extend your coverage without the hardware cost. Firebase Test Lab, AWS Device Farm, and BrowserStack all offer access to hundreds of real devices. Use them for automated compatibility passes and to reproduce bugs reported on devices you do not own.

A minimum viable device matrix for a typical mobile game is five to seven devices: two low-tier, two mid-tier, one high-tier Android, and two iOS devices. That will cover roughly 70% of the configurations your players are likely to use.

Automation and manual testing serve different purposes. The mistake is trying to automate everything or refusing to automate anything. Automate the repetitive, deterministic checks that need to run on every build:

Smoke tests: Does the game launch? Does it reach the main menu without crashing? Can it complete a basic gameplay loop?
Build verification: Automated checks that confirm the build compiles, asset bundles load, and critical systems initialise without errors.
Crash detection: Integrate crash reporting (Firebase Crashlytics or Backtrace) from day one so that every crash is logged with a stack trace, device info, and reproduction context.
Unit and integration tests: The Unity Test Framework supports both Edit Mode and Play Mode tests. Use Edit Mode tests for pure logic (damage calculations, inventory management, state machines) and Play Mode tests for systems that depend on the MonoBehaviour lifecycle or scene loading.

Some things require human judgement. Gameplay feel, animation timing, camera behaviour, tutorial clarity, and visual polish are all subjective and context-dependent.
A human tester can tell you that a jump feels floaty or that a menu transition is jarring. An automated test cannot. UX flow testing also benefits from fresh eyes. Bring in testers who have not seen the game before to walk through onboarding, monetisation prompts, and settings menus. Their confusion is data.

The sweet spot is a pipeline where automated tests gate every build and manual testing focuses on areas where human perception adds value.

Performance issues are the most common reason mobile games receive negative reviews. A structured profiling workflow catches these problems before your players do.

The Unity Editor is not a valid profiling target. It adds overhead from the Editor UI, uses your development machine's CPU and GPU, and does not replicate the thermal constraints of a mobile device. Always profile on real hardware. Connect a device via USB, enable Development Build and Autoconnect Profiler in Build Settings, and use the Unity Profiler to capture frames. For deeper GPU analysis, use platform-specific tools: Xcode Instruments for iOS, Android GPU Inspector for Android, and RenderDoc for desktop.

Focus on these metrics during every profiling session:

Frame time: Target 16.67 ms for 60 FPS or 33.33 ms for 30 FPS. Look at the worst-case frames, not the average. A game that averages 14 ms but spikes to 40 ms every few seconds will feel worse than one that holds a steady 30 ms.
Memory allocation: GC allocations during gameplay cause frame hitches. Use the Profiler's GC Alloc column to find per-frame allocations and eliminate them. Object pooling, pre-allocated lists, and avoiding LINQ in hot paths are standard fixes.
Draw calls and batching: Monitor the Rendering Profiler for draw call counts. On mobile, aim to keep draw calls under 100-150 per frame. Use GPU instancing, sprite atlases, and the SRP Batcher to reduce them.
Thermal state: Mobile devices throttle CPU and GPU frequencies when they overheat.
A game that runs well for the first five minutes but drops to 15 FPS after ten minutes has a thermal problem. Monitor thermal state via Android's PowerManager.OnThermalStatusChangedListener or the iOS ProcessInfo thermalState API.

For a comprehensive profiling methodology, see our Unity mobile optimisation checklist. If you need hands-on help, our performance optimisation services cover profiling, bottleneck resolution, and sustained performance tuning.

Finding bugs is only useful if you document them well enough for someone to fix them. A vague report wastes developer time. A precise one saves it. Every bug report should contain:

Title: A concise description of the problem. "Crash on level 3" is too vague. "Crash when opening inventory with 50+ items on Mali GPU devices" is actionable.
Steps to reproduce: Numbered steps that reliably trigger the issue. If the bug is intermittent, note the reproduction rate (e.g., "occurs roughly 1 in 5 attempts").
Expected vs actual behaviour: What should happen and what actually happens.
Environment: Device model, OS version, game version, build number.
Evidence: Screenshots, screen recordings, or log files. For crashes, include the stack trace.

We use a four-level severity system:

S1 (Critical): Crashes, data loss, or issues that block progression. These are release blockers and must be fixed before the next build ships.
S2 (High): Major functionality broken but a workaround exists. These should be fixed in the current sprint.
S3 (Medium): Minor issues that affect the experience but do not break functionality. Scheduled for the next available sprint.
S4 (Low): Cosmetic issues, minor text errors, or edge cases with negligible impact. Fixed when convenient.

Triage meetings should happen at least twice a week during active development and daily during the final weeks before launch. The goal is to review new bugs, assign severity, and ensure S1 and S2 issues are never left unassigned.
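The rule that S1 and S2 issues must never be left unassigned is easy to enforce mechanically as part of triage. A hedged Python sketch follows; the field names are invented, and a real backlog would come from your issue tracker's API:

```python
# Invented backlog snapshot; severities follow the four-level scheme above.
BUGS = [
    {"id": 101, "severity": "S1", "assignee": None},
    {"id": 102, "severity": "S3", "assignee": None},
    {"id": 103, "severity": "S2", "assignee": "alice"},
]

def triage_violations(bugs):
    """Return ids of bugs breaking the rule: S1/S2 must never be unassigned."""
    return [b["id"] for b in bugs
            if b["severity"] in ("S1", "S2") and b["assignee"] is None]

print(triage_violations(BUGS))  # → [101]
```

Run as a scheduled check, a non-empty result can fail a CI job or ping the triage channel, so the rule holds between meetings rather than only during them.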
The weeks before submission are when QA intensity peaks. A structured checklist prevents last-minute surprises. Both Apple and Google have submission requirements that can cause rejection if missed:

- Privacy policy URL is set and accessible.
- App permissions are justified in the store listing (camera, microphone, location).
- Age rating questionnaire is completed accurately.
- IAP products are configured and tested in sandbox environments.
- IDFA/ATT prompt is implemented correctly on iOS.
- Data safety section is filled in on Google Play.
- Screenshots and promotional assets meet dimension and content requirements.

Run a full regression pass on your device matrix. Confirm that crash-free rates are above 99.5% over the last 1,000 sessions. Verify that analytics events fire correctly and that deep links resolve to the right screens.

For a comprehensive pre-submission walkthrough, see our platform readiness guide and our platform readiness checklist. If you need support with the submission process itself, our app store launch services cover everything from store listing optimisation to post-launch monitoring.

Good QA is a system, not a heroic effort. Build the framework early, automate what you can, profile on real hardware, and triage ruthlessly. The bugs that hurt your launch are rarely the ones nobody found. They are the ones that were found, reported poorly, and deprioritised without enough context.

At Ocean View Games, QA is embedded into every project from kickoff. If you are looking for a development partner that treats testing as a core discipline rather than a final hurdle, take a look at our QA testing services or get in touch to discuss your project.

Written by David Edgecombe, Unity Certified Expert and Technical Lead at Ocean View Games. David previously served as Mobile Team Lead at Jagex, where he led the team responsible for RuneScape Mobile (2017-2019), and has over 10 years of Unity development experience across mobile, educational, and multiplayer titles.