# Experimental Detection API

`obia.detection` is experimental. Its interfaces and workflow may change in future releases.
## obia.detection.dataset

### TreeDetectionDataset

Bases: `Dataset`

Represents a dataset for tree detection tasks.

This class handles loading, preprocessing, and transforming tree detection datasets. Images and annotations are loaded and preprocessed for deep learning models. It supports geometric and color augmentations if transforms are provided, and optional scaling of pixel values.
Attributes:

| Name | Type | Description |
|---|---|---|
| `images_dir` | `str` | Path to the directory containing image files. |
| `annotations` | `dict` | Parsed annotations for the dataset, loaded from the JSON file. |
| `image_ids` | `list` | List of image IDs corresponding to the keys in the annotations. |
| `transforms` | `callable`, optional | A callable for data augmentation and transformations. It must support the `image`, `bboxes`, and `labels` keys for input and output. |
| `do_scale` | `bool` | Whether to scale image pixel values to the range 0-255. |
Source code in obia/detection/dataset.py
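The optional `do_scale` step maps each band's pixel values onto 0-255. A minimal pure-Python sketch of that min-max scaling (the real dataset works on NumPy arrays; `scale_to_255` is a hypothetical helper, not part of the obia API):

```python
def scale_to_255(band):
    """Min-max scale a band's values to the range 0-255.

    Mirrors the idea behind the dataset's do_scale option; a constant
    band is mapped to all zeros to avoid division by zero.
    """
    lo, hi = min(band), max(band)
    if hi == lo:
        return [0.0 for _ in band]
    return [(v - lo) / (hi - lo) * 255.0 for v in band]

# Raw digital numbers become 0-255 intensities.
print(scale_to_255([10, 20, 30]))  # [0.0, 127.5, 255.0]
```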
## obia.detection.models

### RetinaNet-based Detection Model (Modified for N-Channel Input)

Allows specifying `in_channels` for multi-band data. By default, the pretrained backbone expects 3 channels. The first convolutional layer is replaced to match `in_channels`, with the pretrained weights copied, partially or fully, into the first 3 channels when `in_channels >= 3`.
### build_detection_model(num_classes=2, in_channels=3)

Builds a RetinaNet model with optional adjustment for N-band input.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `num_classes` | `int` | Number of classes (including background if you prefer). | `2` |
| `in_channels` | `int` | Number of input channels (e.g., 4 for RGB+CHM). | `3` |

Returns:

| Name | Type | Description |
|---|---|---|
| `model` | `Module` | The modified RetinaNet model. |
Source code in obia/detection/models.py
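The channel-expansion idea described above can be illustrated without torchvision. A hedged sketch using nested lists in place of the real conv weight tensor (`expand_first_conv_weights` is a hypothetical helper; how obia initializes the extra channels is an assumption — here they reuse the mean of the RGB kernels, a common choice):

```python
def expand_first_conv_weights(rgb_weights, in_channels):
    """Build first-conv weights for in_channels bands from pretrained
    3-channel weights, shaped [out][in][k][k] as nested lists.

    The first 3 input channels keep the pretrained kernels; any extra
    channels are filled with the mean of the RGB kernels (an assumed
    initialization, not necessarily what obia does).
    """
    new_weights = []
    for out_kernels in rgb_weights:  # one entry per output channel
        r, g, b = out_kernels
        # Element-wise mean of the three pretrained kernels.
        mean_k = [[(r[i][j] + g[i][j] + b[i][j]) / 3
                   for j in range(len(r[0]))] for i in range(len(r))]
        extra = [mean_k for _ in range(in_channels - 3)]
        new_weights.append([r, g, b] + extra)
    return new_weights

# One output channel, 1x1 kernels, expanded from 3 to 4 input channels:
# the 4th channel receives the mean kernel.
w = expand_first_conv_weights([[[[3.0]], [[6.0]], [[9.0]]]], in_channels=4)
print(w[0][3])  # [[6.0]]
```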
## obia.detection.train

### Training Script for a RetinaNet-based Detection Model

### train_model(model, train_loader, num_epochs, device='cpu')

Trains the RetinaNet detection model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `Module` | The detection model (e.g., from `build_detection_model`). | required |
| `train_loader` | `DataLoader` | DataLoader for training data. | required |
| `num_epochs` | `int` | Number of epochs to train. | required |
| `device` | `str` | Device to use (`"cpu"`, `"cuda"`, or `"mps"`). | `'cpu'` |

Returns:

| Name | Type | Description |
|---|---|---|
| `model` | `Module` | Trained model. |
Source code in obia/detection/train.py
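In training mode, torchvision-style detection models return a dict of named losses per batch, which are summed before the backward pass. A minimal pure-Python sketch of that loop shape (the stand-in `model` and loader below are illustrations, not the obia API):

```python
def train_loop(model, train_loader, num_epochs):
    """Sketch of a detection training loop: the model returns a dict of
    named losses per batch, and the total loss is their sum."""
    history = []
    for epoch in range(num_epochs):
        epoch_loss = 0.0
        for images, targets in train_loader:
            loss_dict = model(images, targets)  # e.g. {"classification": ..., "bbox_regression": ...}
            total = sum(loss_dict.values())     # combined loss for the backward pass
            # In the real loop: optimizer.zero_grad(); total.backward(); optimizer.step()
            epoch_loss += total
        history.append(epoch_loss)
    return history

# Stand-in model and one-batch "loader" for illustration.
dummy_model = lambda imgs, tgts: {"classification": 0.5, "bbox_regression": 0.5}
loader = [(["img"], ["target"])]
print(train_loop(dummy_model, loader, num_epochs=2))  # [1.0, 1.0]
```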
## obia.detection.predict

### Object Detection Prediction Script (Multi-band + 0..255 Scaling)

Provides a function `predict()` for running inference with a custom RetinaNet-based model on an N-band raster. Uses rasterio to read the data, scales each band to [0..255], then feeds it to the model.

### predict(model, image_path, device='cpu', score_threshold=0.5)
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `Module` | Trained RetinaNet model (with `in_channels` matching your data). | required |
| `image_path` | `str` | Path to the multi-band raster (GeoTIFF, etc.). | required |
| `device` | `str` | `"cpu"`, `"cuda"`, or `"mps"`. | `'cpu'` |
| `score_threshold` | `float` | Minimum confidence for a detection. | `0.5` |

Returns:

| Type | Description |
|---|---|
| `dict` | dict with `{"boxes": ndarray, "scores": ndarray, "labels": ndarray}` |
Source code in obia/detection/predict.py
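The `score_threshold` step keeps only detections whose confidence reaches the cutoff. A pure-Python sketch of that filtering (the real function operates on NumPy arrays; `filter_by_score` is a hypothetical helper, not the obia API):

```python
def filter_by_score(detections, score_threshold):
    """Keep the boxes/scores/labels whose score is >= score_threshold."""
    keep = [i for i, s in enumerate(detections["scores"]) if s >= score_threshold]
    return {key: [vals[i] for i in keep] for key, vals in detections.items()}

raw = {
    "boxes": [[0, 0, 10, 10], [5, 5, 20, 20]],
    "scores": [0.9, 0.3],
    "labels": [1, 1],
}
# Only the 0.9-score detection survives a 0.5 threshold.
print(filter_by_score(raw, 0.5))
```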
## obia.detection.utils

### Transforms, Collate Function, and Utility Helpers

Provides:

- `get_transforms()`: Albumentations pipelines for train/val
- `collate_fn()`: custom collate for object detection
- `calculate_iou()`: compute IoU between two boxes
- `visualize_predictions()`: draw detection boxes and scores on an image
### calculate_iou(box1, box2)

Compute Intersection over Union (IoU) for two bounding boxes. Boxes are assumed to be in `[x_min, y_min, x_max, y_max]` format.
Source code in obia/detection/utils.py
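A pure-Python sketch of IoU for axis-aligned boxes in `[x_min, y_min, x_max, y_max]` format, written from the description above (the packaged implementation may differ in detail):

```python
def iou(box1, box2):
    """Intersection over Union for two [x_min, y_min, x_max, y_max] boxes."""
    ix_min, iy_min = max(box1[0], box2[0]), max(box1[1], box2[1])
    ix_max, iy_max = min(box1[2], box2[2]), min(box1[3], box2[3])
    # Clamp to zero so disjoint boxes contribute no intersection area.
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union = area1 + area2 - inter
    return inter / union if union > 0 else 0.0

# Two 10x10 boxes overlapping in a 5x5 region: IoU = 25 / 175.
print(iou([0, 0, 10, 10], [5, 5, 15, 15]))
```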
### collate_fn(batch)

Custom collate function for object detection in PyTorch. Returns lists of images and targets.
Source code in obia/detection/utils.py
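Detection targets vary in size per image, so a batch cannot be stacked into a single tensor; the common fix is to keep images and targets as parallel sequences. A sketch of what such a collate function typically does (the exact obia implementation may differ):

```python
def collate_fn(batch):
    """Turn a list of (image, target) pairs into a pair of tuples:
    one of images, one of targets, without stacking."""
    return tuple(zip(*batch))

# Two images with different numbers of boxes per target.
batch = [("img1", {"boxes": [[0, 0, 1, 1]]}),
         ("img2", {"boxes": [[2, 2, 3, 3], [4, 4, 5, 5]]})]
images, targets = collate_fn(batch)
print(images)  # ('img1', 'img2')
```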
### get_transforms(train=True)

Returns Albumentations transforms for bounding-box tasks. If you have more than 3 channels, consider removing any 3-channel-specific normalization or adjusting its mean/std.
Source code in obia/detection/utils.py
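A bounding-box-aware Albumentations pipeline typically looks like the following configuration sketch. The specific transforms shown here are assumptions for illustration, not necessarily the ones obia uses; the key part is `bbox_params`, which keeps boxes and labels in sync with the image augmentations:

```python
import albumentations as A

# Hypothetical training pipeline; the actual transforms in obia may differ.
train_transforms = A.Compose(
    [
        A.HorizontalFlip(p=0.5),
        A.RandomBrightnessContrast(p=0.2),
    ],
    # Pascal VOC format matches [x_min, y_min, x_max, y_max] boxes.
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
)
```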
### visualize_predictions(image_path, detection_output, score_threshold=0.0)

Draws bounding boxes (and scores) on an image and displays it using matplotlib.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `image_path` | `str` | Path to the image file. | required |
| `detection_output` | `dict` | Must contain `"boxes"`, `"scores"`, `"labels"`. | required |
| `score_threshold` | `float` | Only visualize boxes with score >= threshold. | `0.0` |
Source code in obia/detection/utils.py