Implement ML-based card detection and classification

2026-05-08 23:30:57 +02:00 · 073d395ae6
commit 073d395ae6
parent 9b427d2df8
5 changed files with 274 additions and 81 deletions
--- a/ML_SETUP_GUIDE.md
+++ b/ML_SETUP_GUIDE.md
@@ -0,0 +1,81 @@
+# Jass Card Detection ML Setup Guide
+
+This document outlines the strategy and implementation details for transitioning the card detection system from basic color thresholding to a deep learning-based object recognition pipeline.
+
+## 1. Architecture Overview
+The system uses a **Hybrid Pipeline** to balance performance on mobile devices with high accuracy.
+
+**Workflow:**
+`Camera Feed` $\rightarrow$ `Image Processing (Localization)` $\rightarrow$ `Card Cropping` $\rightarrow$ `ML Classifier (Identity)` $\rightarrow$ `Game State Update`
+
+1.  **Localization**: Uses brightness and shape analysis to identify bounding boxes of cards (existing logic in `Detection.tsx`).
+2.  **Classification**: Crops each detected card and passes it through two specialized TensorFlow.js models to determine the **Suit** and the **Value**.
+
+## 2. Data Collection & Labeling
+Because Jass cards have unique iconography, we use a custom dataset created from internet samples.
+
+### Labeling Strategy: Folder-Based Annotation
+Instead of annotating images with bounding-box tools, we use the directory structure as labels: the name of the folder an image sits in is its class label.
+
+**Directory Structure:**
+```text
+dataset/
+├── suit_model/
+│   ├── Schellen/  (Bells)
+│   ├── Schilten/  (Shields)
+│   ├── Eicheln/   (Acorns)
+│   └── Rosen/     (Roses)
+└── value_model/
+    ├── 6/
+    ├── 7/
+    ...
+    └── 13/
+```
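
The folder-as-label convention above can be sketched in a few lines of standard-library Python. This is only an illustration: the function name `list_labeled_images` and the accepted file suffixes are assumptions, not part of this commit.

```python
from pathlib import Path
from typing import List, Tuple

# Assumed image file types; adjust to whatever the dataset actually contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_labeled_images(model_dir: Path) -> List[Tuple[Path, str]]:
    """Return (image_path, label) pairs for one model's dataset folder.

    The label of each image is the name of its parent folder,
    e.g. dataset/suit_model/Rosen/img01.png -> "Rosen".
    """
    samples: List[Tuple[Path, str]] = []
    class_dirs = sorted(p for p in model_dir.iterdir() if p.is_dir())
    for class_dir in class_dirs:
        for image_path in sorted(class_dir.iterdir()):
            # Skip anything that is not an image (notes, hidden files, ...).
            if image_path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((image_path, class_dir.name))
    return samples
```

Calling `list_labeled_images(Path("dataset/suit_model"))` would yield one pair per image. Keras offers `tf.keras.utils.image_dataset_from_directory`, which infers labels from subfolder names in the same way and may be the more practical choice during training.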
+
+### Process:
+1.  **Sourcing**: Collect 30-50 images per class from Swiss-German Jass deck galleries.
+2.  **Cropping**: Crop the center of the card for the Suit model and the corners for the Value model.
+3.  **Augmentation**: Use a script to generate variations:
+    *   Rotations ($\pm 15^\circ$)
+    *   Brightness/Contrast shifts
+    *   Gaussian noise to simulate mobile camera grain.
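
A minimal sketch of two of the augmentations listed above (brightness/contrast shifts and Gaussian noise), written against plain nested lists so it runs without any image library. Rotation is left out because it would normally be delegated to an image library. All function names and parameter ranges here are assumptions chosen for illustration, not values from this commit.

```python
import random
from typing import List

# Grayscale image as rows of pixel values in [0, 255].
Image = List[List[float]]


def _clip(value: float) -> float:
    """Keep a pixel value inside the valid [0, 255] range."""
    return max(0.0, min(255.0, value))


def adjust_brightness_contrast(image: Image, brightness: float, contrast: float) -> Image:
    """Scale each pixel's distance from mid-gray by `contrast`, then add `brightness`."""
    return [
        [_clip((px - 128.0) * contrast + 128.0 + brightness) for px in row]
        for row in image
    ]


def add_gaussian_noise(image: Image, sigma: float, rng: random.Random) -> Image:
    """Add zero-mean Gaussian noise to every pixel to imitate camera grain."""
    return [[_clip(px + rng.gauss(0.0, sigma)) for px in row] for row in image]


def augment(image: Image, rng: random.Random) -> Image:
    """Produce one randomized variation of `image` (assumed parameter ranges)."""
    brightness = rng.uniform(-30.0, 30.0)
    contrast = rng.uniform(0.8, 1.2)
    shifted = adjust_brightness_contrast(image, brightness, contrast)
    return add_gaussian_noise(shifted, sigma=8.0, rng=rng)
```

Passing a seeded `random.Random` makes the generated variations reproducible, which helps when comparing training runs.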
+
+## 3. Model Training (Python)
+We use **Transfer Learning** to minimize the required dataset size.
+
+*   **Base Model**: MobileNetV2 (pre-trained on ImageNet).
+*   **Modification**: Remove the final 1000-class layer and replace it with a Dense layer matching the number of labels (4 for suits, 8 for values).
+*   **Optimization**:
+    *   Loss: `categorical_crossentropy`
+    *   Optimizer: `Adam`
+    *   Regularization: Dropout (0.2) to prevent overfitting.
+*   **Target Size**: 64x64 pixels.
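
To make the loss choice above concrete, here is a small worked example of what `categorical_crossentropy` computes for a 4-class suit prediction. It assumes one-hot targets (which Keras expects for this loss) and a softmax over the final layer's outputs. The class order in `SUITS` and the helper names are illustrative assumptions.

```python
import math
from typing import List, Sequence

# Assumed class order; the real order depends on how labels are indexed in training.
SUITS = ["Eicheln", "Rosen", "Schellen", "Schilten"]


def one_hot(label: str, classes: Sequence[str]) -> List[float]:
    """Encode a label as a vector with 1.0 at the label's index and 0.0 elsewhere."""
    return [1.0 if c == label else 0.0 for c in classes]


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn raw layer outputs into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(target: Sequence[float], probs: Sequence[float]) -> float:
    """Negative log-probability assigned to the true class."""
    eps = 1e-7  # avoid log(0)
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, probs))
```

With an undecided model (`softmax([0, 0, 0, 0])`, each class at 0.25), the loss for any true suit is `ln 4`. The loss approaches 0 as the probability of the correct suit approaches 1. This is why the replacement Dense layer needs exactly one output per label.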
+
+## 4. Conversion & Deployment
+Once trained in Python, the models are converted to the TensorFlow.js format.
+
+**Conversion Command:**
+```bash
+tensorflowjs_converter --input_format=keras /path/to/model.h5 ./public/models/suit_model
+```
+
+**Deployment Path:**
+The app expects the models in the public directory:
+*   `/public/models/suit_model/model.json`
+*   `/public/models/value_model/model.json`
+
+## 5. Frontend Integration
+The integration is handled by `src/services/CardModelService.ts`.
+
+*   **Initialization**: Both models are loaded in parallel when `App.jsx` mounts.
+*   **Inference**:
+    *   Tensors are created from the `canvas` crop.
+    *   Normalization is applied (`div(255.0)`).
+    *   `tf.tidy()` is used to wrap operations and prevent WebGL memory leaks.
+*   **Fallback**: If the models are missing or fail to load, the app automatically reverts to the legacy color-analysis detection so that scanning remains functional.
+
+## 6. Validation Metrics
+To verify the setup, the following metrics should be measured:
+1.  **Inference Latency**: Time from the "Scan Table" click to the displayed result (Target: < 500 ms).
+2.  **Classification Accuracy**: Percentage of correct identifications on a hold-out test set.
+3.  **Memory Footprint**: GPU memory usage during a scan.
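
The classification accuracy metric above could be computed on the Python side along these lines. This is a sketch under the assumption that true and predicted labels for the hold-out set are available as two parallel lists. The function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of hold-out samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("the hold-out set is empty")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def per_class_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Accuracy computed separately for every true class present in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}
```

Reporting the per-class figures next to the overall number shows whether errors are concentrated in particular suits or values, which a single percentage would hide.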