Implement ML-based card detection and classification
parent 9b427d2df8
commit 073d395ae6
5 changed files with 274 additions and 81 deletions
81 ML_SETUP_GUIDE.md Normal file
@@ -0,0 +1,81 @@
# Jass Card Detection ML Setup Guide

This document outlines the strategy and implementation details for moving the card detection system from basic color thresholding to a deep-learning-based recognition pipeline.

## 1. Architecture Overview
The system uses a **Hybrid Pipeline** to balance performance on mobile devices with high accuracy.

**Workflow:**
`Camera Feed` → `Image Processing (Localization)` → `Card Cropping` → `ML Classifier (Identity)` → `Game State Update`

1. **Localization**: Uses brightness and shape analysis to find the bounding box of each card (existing logic in `Detection.tsx`).
2. **Classification**: Crops each detected card and passes the crop through two specialized TensorFlow.js models, one predicting the **Suit** and one predicting the **Value**.
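
The two stages above can be sketched as a small function. This is an illustrative outline only, written in Python for brevity rather than in the app's TypeScript; `Card`, `crop`, and `identify_cards` are hypothetical names, and the localizer and both classifiers are passed in as plain callables because their real implementations are not shown in this guide.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x, y, width, height), an assumed convention
Frame = Sequence[Sequence[int]]   # rows of pixel values, stand-in for a camera frame


@dataclass(frozen=True)
class Card:
    suit: str
    value: str
    box: Box


def crop(frame: Frame, box: Box) -> List[List[int]]:
    """Cut the region described by `box` out of the frame."""
    x, y, width, height = box
    return [list(row[x:x + width]) for row in frame[y:y + height]]


def identify_cards(
    frame: Frame,
    localize: Callable[[Frame], List[Box]],
    classify_suit: Callable[[List[List[int]]], str],
    classify_value: Callable[[List[List[int]]], str],
) -> List[Card]:
    """Stage 1 finds boxes; stage 2 classifies each crop with both models."""
    cards = []
    for box in localize(frame):
        patch = crop(frame, box)
        cards.append(Card(classify_suit(patch), classify_value(patch), box))
    return cards
```

Keeping localization and classification behind separate callables mirrors the hybrid design: either stage can be swapped (for example, for the legacy color analysis) without touching the other.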

## 2. Data Collection & Labeling
Because Jass cards have their own iconography, we build a custom dataset from images sourced online.

### Labeling Strategy: Folder-Based Annotation
Instead of annotating with bounding-box tools, we use the directory structure as labels: each image takes the name of its parent folder as its class.

**Directory Structure:**
```text
dataset/
├── suit_model/
│   ├── Schellen/   (Bells)
│   ├── Schilten/   (Shields)
│   ├── Eicheln/    (Acorns)
│   └── Rosen/      (Roses)
└── value_model/
    ├── 6/
    ├── 7/
    ...
    └── 13/
```
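
As a rough illustration of folder-based labeling, the sketch below walks one model directory and assigns each image the index of its parent folder. The helper names and the accepted file suffixes are assumptions. Class names are sorted as strings, which is also how Keras' `image_dataset_from_directory` orders inferred labels, so the value folders come out as `10, 11, 12, 13, 6, 7, 8, 9` rather than in numeric order.

```python
from pathlib import Path
from typing import List, Tuple

# Assumption: the dataset only contains these image types.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def class_names(model_dir: Path) -> List[str]:
    """Return sub-folder names in string-sorted order (index = class id)."""
    return sorted(p.name for p in model_dir.iterdir() if p.is_dir())


def labelled_images(model_dir: Path) -> List[Tuple[Path, int]]:
    """Pair every image file with the class index of its parent folder."""
    samples = []
    for index, name in enumerate(class_names(model_dir)):
        for image in sorted((model_dir / name).iterdir()):
            if image.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((image, index))
    return samples
```

Whatever order is used at training time has to be reproduced when decoding predictions in the frontend, otherwise output index 0 would be read as `6` when the model means `10`.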

### Process:
1. **Sourcing**: Collect 30–50 images per class from Swiss-German Jass deck galleries.
2. **Cropping**: Crop the center of the card for the Suit model and the corners for the Value model.
3. **Augmentation**: Use a script to generate variations:
    * Rotations (±15°)
    * Brightness/contrast shifts
    * Gaussian noise to simulate mobile camera grain
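
The augmentation step could look roughly like the following. It is a dependency-free sketch that operates on a grayscale image stored as nested lists; a real script would more likely use an imaging library. Only the ±15° rotation range comes from this guide: the brightness, contrast, and noise ranges are placeholder assumptions, and all function names are hypothetical.

```python
import math
import random
from typing import List

Image = List[List[float]]  # grayscale pixels in [0, 255]


def clamp(value: float) -> float:
    return max(0.0, min(255.0, value))


def rotate(img: Image, degrees: float, fill: float = 0.0) -> Image:
    """Nearest-neighbour rotation about the image centre."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: look up which source pixel lands on (x, y).
            sx = round(c * (x - cx) + s * (y - cy) + cx)
            sy = round(-s * (x - cx) + c * (y - cy) + cy)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = img[sy][sx]
    return out


def adjust(img: Image, brightness: float, contrast: float) -> Image:
    """Scale contrast around mid-grey, then shift brightness."""
    return [[clamp((p - 128.0) * contrast + 128.0 + brightness) for p in row]
            for row in img]


def add_noise(img: Image, sigma: float, rng: random.Random) -> Image:
    """Add zero-mean Gaussian noise to imitate camera grain."""
    return [[clamp(p + rng.gauss(0.0, sigma)) for p in row] for row in img]


def augment(img: Image, rng: random.Random) -> Image:
    angle = rng.uniform(-15.0, 15.0)       # range from the guide
    brightness = rng.uniform(-30.0, 30.0)  # assumed range
    contrast = rng.uniform(0.8, 1.2)       # assumed range
    sigma = rng.uniform(0.0, 8.0)          # assumed range
    return add_noise(adjust(rotate(img, angle), brightness, contrast), sigma, rng)
```

Passing in a seeded `random.Random` makes a generated dataset reproducible, which helps when comparing training runs.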

## 3. Model Training (Python)
We use **Transfer Learning** to minimize the required dataset size.

* **Base Model**: MobileNetV2 (pre-trained on ImageNet).
* **Modification**: Remove the final 1000-class layer and replace it with a Dense layer whose size matches the number of labels (4 for suits, 8 for values).
* **Optimization**:
    * Loss: `categorical_crossentropy`
    * Optimizer: `Adam`
    * Regularization: Dropout (0.2) to reduce overfitting.
* **Target Size**: 64×64 pixels.
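
A minimal Keras sketch of the configuration listed above, not the project's actual training script. `build_classifier` and `NUM_CLASSES` are hypothetical names; freezing the base network, using global average pooling, and a softmax output are assumptions the guide does not state. TensorFlow is imported inside the function only so the constants can be read without it installed.

```python
from typing import Dict

# Class counts and input size as stated in this guide.
NUM_CLASSES: Dict[str, int] = {"suit_model": 4, "value_model": 8}
INPUT_SIZE = 64


def build_classifier(num_classes: int, input_size: int = INPUT_SIZE):
    """Build MobileNetV2 without its ImageNet head plus a small new head."""
    import tensorflow as tf  # lazy import, see lead-in

    base = tf.keras.applications.MobileNetV2(
        input_shape=(input_size, input_size, 3),
        include_top=False,    # drops the 1000-class ImageNet layer
        weights="imagenet",
        pooling="avg",        # assumption: global average pooling
    )
    base.trainable = False    # assumption: train only the new head

    # Inputs are expected already scaled to [0, 1], matching the
    # `div(255.0)` applied in the frontend. MobileNetV2's ImageNet weights
    # were trained on inputs in [-1, 1]; this sketch follows the guide.
    inputs = tf.keras.Input(shape=(input_size, input_size, 3))
    x = base(inputs, training=False)
    x = tf.keras.layers.Dropout(0.2)(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",  # expects one-hot encoded labels
        metrics=["accuracy"],
    )
    return model
```

Calling `build_classifier(NUM_CLASSES["suit_model"])` and `build_classifier(NUM_CLASSES["value_model"])` would give the two models. Keras warns that 64×64 is not one of the sizes the pretrained weights were published for and loads the 224×224 weights instead, which still works with `include_top=False`.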

## 4. Conversion & Deployment
Once trained in Python, the models are converted to the TensorFlow.js format.

**Conversion Command:**
```bash
tensorflowjs_converter --input_format=keras /path/to/model.h5 ./public/models/suit_model
```

Run the command once per model from the project root, writing the second model to `./public/models/value_model`.

**Deployment Path:**
The app expects the models in the public directory:
* `/public/models/suit_model/model.json`
* `/public/models/value_model/model.json`

## 5. Frontend Integration
The integration is handled by `src/services/CardModelService.ts`.

* **Initialization**: Both models are loaded in parallel when `App.jsx` mounts.
* **Inference**:
    * Tensors are created from the `canvas` crop.
    * Pixel values are normalized to the [0, 1] range (`div(255.0)`).
    * Operations are wrapped in `tf.tidy()` to prevent WebGL memory leaks.
* **Fallback**: If the models are missing or fail to load, the app automatically reverts to the legacy color-analysis detection so that it remains functional.
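
The inference and fallback logic can be outlined independently of TensorFlow.js. The sketch below is in Python for consistency with the training side and is not the content of `CardModelService.ts`; the function names are hypothetical, the model is represented as a plain callable returning class probabilities, and the label order assumes the string-sorted folder names from the dataset layout.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Assumption: class indices follow string-sorted folder names.
SUIT_LABELS = ["Eicheln", "Rosen", "Schellen", "Schilten"]

Model = Callable[[List[float]], Sequence[float]]


def normalize(pixels: Sequence[int]) -> List[float]:
    """Scale 8-bit pixel values to [0, 1], the equivalent of div(255.0)."""
    return [p / 255.0 for p in pixels]


def decode(probabilities: Sequence[float], labels: Sequence[str]) -> Tuple[str, float]:
    """Return the most likely label and its probability."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best], probabilities[best]


def classify_suit(
    pixels: Sequence[int],
    model: Optional[Model],
    legacy: Callable[[Sequence[int]], str],
) -> str:
    """Use the ML model when it loaded, otherwise the legacy detector."""
    if model is None:  # model missing or failed to load
        return legacy(pixels)
    label, _ = decode(model(normalize(pixels)), SUIT_LABELS)
    return label
```

Representing "model failed to load" as `None` keeps the fallback decision in one place, so the calling code never needs to know which detector produced the answer.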

## 6. Validation Metrics
To verify the setup, measure the following:
1. **Inference Latency**: Time from the "Scan Table" click to the result (Target: < 500 ms).
2. **Classification Accuracy**: Percentage of correct identifications on a hold-out test set.
3. **Memory Footprint**: GPU memory usage during a scan.
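
The first two metrics are simple to compute once predictions and timings are available. The helpers below are a sketch with hypothetical names; `scan` stands for whatever function runs one full table scan. GPU memory is not covered here, since it has to be read in the browser (TensorFlow.js exposes `tf.memory()` for that).

```python
import statistics
import time
from typing import Callable, Dict, Sequence

LATENCY_TARGET_MS = 500.0  # target from this guide


def accuracy(predicted: Sequence[str], expected: Sequence[str]) -> float:
    """Fraction of hold-out samples whose predicted label is correct."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("predicted and expected must be non-empty and equal in length")
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)


def per_class_accuracy(predicted: Sequence[str], expected: Sequence[str]) -> Dict[str, float]:
    """Accuracy broken down by true label, to spot weak classes."""
    totals: Dict[str, int] = {}
    hits: Dict[str, int] = {}
    for p, e in zip(predicted, expected):
        totals[e] = totals.get(e, 0) + 1
        hits[e] = hits.get(e, 0) + (p == e)
    return {label: hits[label] / totals[label] for label in totals}


def median_latency_ms(scan: Callable[[], object], runs: int = 20) -> float:
    """Median wall-clock time of `scan` over several runs, in milliseconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        scan()
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)
```

Reporting the median rather than the mean keeps a single slow first run (model warm-up, shader compilation) from dominating the latency figure.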