arxiv_cs_cv February 10, 2026

ScatSpotter: A Dataset for Dog Feces Detection

"ScatSpotter" -- A Dog Poop Detection Dataset

Translated: 2026/3/15 3:02:24

dataset, object-detection, image-classification, computer-vision, scat-spotter

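Japanese Translation (rendered in English)

arXiv:2412.16473v2 Announce Type: replace

Abstract: Small, irregularly shaped waste such as droppings and microtrash is hard to spot, particularly in cluttered scenes, yet it matters for environmental cleanliness, public health, and autonomous cleanup. We present "ScatSpotter", a new dataset of images in which dog feces are annotated with polygons, collected for training and studying object detection and segmentation systems that target small, potentially camouflaged outdoor waste. We collected data mainly in urban environments using a "before/after/negative" (BAN) protocol: an image with the object present, an image from the same viewpoint after its removal, and a nearby negative scene that often contains visually similar confusers. Image collection began in 2020. This paper focuses on two dataset checkpoints, from 2025 and 2024. The dataset contains more than 9,000 images and 6,000 polygon annotations. Of the author-captured images, 691 were held out for validation and the rest were used for training. Through community participation we obtained an independent 121-image test set that, although small, provides some confidence in generalization across photographers, devices, and locations. Because the test set is small, we report both validation and test results. We explored the difficulty of the dataset with off-the-shelf ViT, MaskRCNN, YOLO-v9, and DINO-v2 models. Zero-shot DINO performs poorly, which indicates limited foundation-model coverage of this category. Tuned DINO performed best, reaching a box-level average precision of 0.69 on the 691-image validation set and 0.70 on the test set. These results establish strong baselines and quantify the difficulty that remains in detecting small, camouflaged waste. To support open access to models and data, we compare centralized and decentralized distribution mechanisms and discuss the trade-offs of sharing scientific data. Code and project details are hosted on GitHub.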

Original Content

arXiv:2412.16473v2 Announce Type: replace

Abstract: Small, amorphous waste objects such as biological droppings and microtrash can be difficult to see, especially in cluttered scenes, yet they matter for environmental cleanliness, public health, and autonomous cleanup. We introduce "ScatSpotter": a new dataset of images annotated with polygons around dog feces, collected to train and study object detection and segmentation systems for small, potentially camouflaged outdoor waste. We gathered data in mostly urban environments, using a "before/after/negative" (BAN) protocol: for a given location, we capture an image with the object present, an image from the same viewpoint after removal, and a nearby negative scene that often contains visually similar confusers. Image collection began in 2020. This paper focuses on two dataset checkpoints from 2025 and 2024. The dataset contains over 9000 images and 6000 polygon annotations. Of the author-captured images we held out 691 for validation and used the rest to train. Via community participation we obtained a 121-image test set that, while small, is independent from author-collected images and provides some generalization confidence across photographers, devices, and locations. Due to its limited size, we report both validation and test results. We explore the difficulty of the dataset using off-the-shelf ViT, MaskRCNN, YOLO-v9, and DINO-v2 models. Zero-shot DINO performs poorly, indicating limited foundational-model coverage of this category. Tuned DINO is the best model with a box-level average precision of 0.69 on a 691-image validation set and 0.7 on the test set. These results establish strong baselines and quantify the remaining difficulty of detecting small, camouflaged waste objects. To support open access to models and data, we compare centralized and decentralized distribution mechanisms and discuss trade-offs for sharing scientific data. Code and project details are hosted on GitHub.
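
The "before/after/negative" (BAN) protocol described in the abstract groups three images per location. The following is a minimal sketch of how such triples might be represented and assembled. The record fields (`location_id`, `role`, `path`), the `BanTriple` class, and the `group_ban` function are all hypothetical names chosen for illustration; the abstract does not describe the dataset's actual file layout or schema.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional

# Hypothetical role labels for illustration; the real dataset schema may differ.
ROLES = ("before", "after", "negative")


@dataclass
class BanTriple:
    """One location's images under the before/after/negative protocol."""
    location_id: str
    before: Optional[str] = None    # image with the object present
    after: Optional[str] = None     # same viewpoint after removal
    negative: Optional[str] = None  # nearby scene, often with similar confusers

    def is_complete(self) -> bool:
        """True when all three roles have an image assigned."""
        return None not in (self.before, self.after, self.negative)


def group_ban(records: Iterable[dict]) -> Dict[str, BanTriple]:
    """Group {'location_id', 'role', 'path'} records into one triple per location."""
    triples: Dict[str, BanTriple] = {}
    for rec in records:
        role = rec["role"]
        if role not in ROLES:
            raise ValueError(f"unknown role: {role!r}")
        loc = rec["location_id"]
        triple = triples.setdefault(loc, BanTriple(loc))
        setattr(triple, role, rec["path"])
    return triples
```

Keeping incomplete triples instead of discarding them is a design choice in this sketch: a location with only a "before" image can still serve as a positive example, and `is_complete()` lets downstream code select full triples when it needs them.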