

## Introduction
This repo contains inference and training code for YOLOv3 in PyTorch. The code works on Linux, macOS and Windows. Training is done on the COCO dataset by default. **Credit to Joseph Redmon for YOLO.**
## Requirements
Python 3.7 or later with all `requirements.txt` dependencies installed, including `torch >= 1.5`. To install, run `pip install -U -r requirements.txt`.
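A quick way to confirm that the interpreter and PyTorch meet these minimums is sketched below. The helper names `meets_minimum` and `check_environment` are hypothetical and not part of this repo, and the version parsing is deliberately simple: it drops local build suffixes such as `+cu101` and compares only the numeric components.

```python
import sys


def meets_minimum(version: str, minimum: str) -> bool:
    """Return True if a dotted version string is >= minimum (hypothetical helper)."""
    def parse(v: str):
        # Drop local build suffixes such as "+cu101", keep leading digits of each part.
        parts = []
        for piece in v.split("+")[0].split("."):
            digits = ""
            for ch in piece:
                if not ch.isdigit():
                    break
                digits += ch
            parts.append(int(digits or 0))
        return parts

    a, b = parse(version), parse(minimum)
    # Pad with zeros so that "1.5" and "1.5.0" compare as equal.
    width = max(len(a), len(b))
    a += [0] * (width - len(a))
    b += [0] * (width - len(b))
    return a >= b


def check_environment() -> list:
    """Collect human-readable problems instead of raising, so all are reported."""
    problems = []
    if sys.version_info < (3, 7):
        problems.append("Python 3.7 or later is required")
    try:
        import torch  # imported here so the check still runs when torch is missing
        if not meets_minimum(torch.__version__, "1.5"):
            problems.append("torch >= 1.5 is required, found " + torch.__version__)
    except ImportError:
        problems.append("torch is not installed")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

An empty result from `check_environment()` means both minimums are met; it does not verify the remaining entries in `requirements.txt`.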
## Tutorials
* <a href=""><img src="" alt="Open In Colab"></a>
* Train Custom Data (highly recommended)
* GCP Quickstart
* Docker Quickstart Guide
* A TensorRT Implementation of YOLOv3 and YOLOv4
## Training
**Start Training:** `python3` to begin training after downloading COCO data with `data/`. Each epoch trains on 117,263 images from the COCO train and validation sets, and tests on 5,000 images from the COCO validation set.
**Resume Training:** `python3 --resume` to resume training from `weights/`.
**Plot Training:** `from utils import utils; utils.plot_results()`
### Image Augmentation
`` applies OpenCV-powered augmentation to the input image. We use a **mosaic dataloader** to increase image variability during training.
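The idea behind a mosaic dataloader can be sketched in a few lines: four training images are tiled into one larger image and their box labels are shifted to match. This is a simplified illustration, not the repo's implementation. It assumes four equally sized square images stored as nested lists and pixel-coordinate `(x1, y1, x2, y2)` boxes, it skips the random placement, cropping and rescaling a practical dataloader would add, and the function name `mosaic4` is hypothetical.

```python
def mosaic4(images, boxes):
    """Tile 4 equally sized s x s images into one 2s x 2s image.

    images: list of 4 images, each a list of s rows of s pixel values.
    boxes:  list of 4 lists of (x1, y1, x2, y2) boxes in pixel coordinates.
    Returns the mosaic image and the boxes shifted into mosaic coordinates.
    """
    if len(images) != 4 or len(boxes) != 4:
        raise ValueError("mosaic4 expects exactly 4 images and 4 box lists")
    s = len(images[0])
    # (x offset, y offset) per tile: top-left, top-right, bottom-left, bottom-right
    offsets = [(0, 0), (s, 0), (0, s), (s, s)]

    canvas = [[0] * (2 * s) for _ in range(2 * s)]
    shifted = []
    for img, img_boxes, (dx, dy) in zip(images, boxes, offsets):
        for y, row in enumerate(img):
            canvas[dy + y][dx:dx + s] = row  # paste one image row into the canvas
        for x1, y1, x2, y2 in img_boxes:
            shifted.append((x1 + dx, y1 + dy, x2 + dx, y2 + dy))
    return canvas, shifted
```

Each mosaic then exposes the model to objects from four different scenes at once, which is where the added variability comes from.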
### Speed
**Machine type:** preemptible n1-standard-8 (8 vCPUs, 30 GB memory)
**CPU platform:** Intel Skylake
**GPUs:** K80 ($0.14/hr), T4 ($0.11/hr), V100 ($0.74/hr) CUDA with Nvidia Apex FP16/32
**Disk:** 300 GB SSD
**Dataset:** COCO train 2014 (117,263 images)
**Model:** `yolov3-spp.cfg`
**Command:** `python3 --data --img 416 --batch 32`
GPU | n | `--batch-size` | img/s | epoch<br>time | epoch<br>cost
--- |--- |--- |--- |--- |---
K80 |1| 32 x 2 | 11 | 175 min | $0.41
T4 |1<br>2| 32 x 2<br>64 x 1 | 41<br>61 | 48 min<br>32 min | $0.09<br>$0.11
V100 |1<br>2| 32 x 2<br>64 x 1 | 122<br>**178** | 16 min<br>**11 min** | **$0.21**<br>$0.28
2080Ti |1<br>2| 32 x 2<br>64 x 1 | 81<br>140 | 24 min<br>14 min | -<br>-
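
The epoch time and cost columns follow approximately from the measured throughput and the hourly GPU price. The sketch below assumes that cost scales linearly with the number of GPUs and that only the GPU price is counted (not the VM itself). Small differences from the table are expected, since the published figures are rounded.

```python
COCO_TRAIN_IMAGES = 117_263  # images per epoch, from the Dataset line above


def epoch_minutes(img_per_s: float, images: int = COCO_TRAIN_IMAGES) -> float:
    """Minutes needed to process one epoch at a given throughput."""
    return images / img_per_s / 60


def epoch_cost(minutes: float, usd_per_gpu_hour: float, n_gpus: int = 1) -> float:
    """GPU cost of one epoch, assuming price scales linearly with GPU count."""
    return minutes / 60 * usd_per_gpu_hour * n_gpus


if __name__ == "__main__":
    # (name, GPUs, img/s, $/hr per GPU) taken from the table and price list above
    rows = [("K80", 1, 11, 0.14), ("T4", 1, 41, 0.11), ("V100", 1, 122, 0.74)]
    for name, n, rate, price in rows:
        minutes = epoch_minutes(rate)
        print(f"{name}: {minutes:.0f} min, ${epoch_cost(minutes, price, n):.2f}")
```

For example, the K80 row's 175 minutes at $0.14/hr works out to about $0.41, matching the table.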
## Inference
`python3 --source ...`
- Image: `--source file.jpg`
- Video: `--source file.mp4`
- Directory: `--source dir/`
- Webcam: `--source 0`
- RTSP stream: `--source rtsp://`
- HTTP stream: `--source`
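
One way such a `--source` argument could be dispatched is sketched below. The function name `classify_source` and the extension lists are hypothetical and chosen for illustration; the repo's own logic may differ.

```python
import os

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed, not exhaustive
VIDEO_EXTS = {".mp4", ".mov", ".avi"}           # assumed, not exhaustive


def classify_source(source: str) -> str:
    """Map a --source string to a coarse input type (hypothetical helper)."""
    if source.isdigit():
        return "webcam"  # e.g. "0" selects the first camera
    if source.lower().startswith(("rtsp://", "http://", "https://")):
        return "stream"
    if source.endswith(("/", os.sep)) or os.path.isdir(source):
        return "directory"
    ext = os.path.splitext(source)[1].lower()
    if ext in IMAGE_EXTS:
        return "image"
    if ext in VIDEO_EXTS:
        return "video"
    raise ValueError(f"unsupported source: {source!r}")
```

Checking for a numeric string and a URL scheme first avoids touching the filesystem for webcam and stream inputs.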
**YOLOv3:** `python3 --cfg cfg/yolov3.cfg --weights`
**YOLOv3-tiny:** `python3 --cfg cfg/yolov3-tiny.cfg --weights`
**YOLOv3-SPP:** `python3 --cfg cfg/yolov3-spp.cfg --weights`
## Pretrained Checkpoints
Download from: [](
## Darknet Conversion
    $ git clone && cd yolov3

    # convert darknet cfg/weights to pytorch model
    $ python3 -c "from models import *; convert('cfg/yolov3-spp.cfg', 'weights/yolov3-spp.weights')"
    Success: converted 'weights/yolov3-spp.weights' to 'weights/'

    # convert cfg/pytorch model to darknet weights
    $ python3 -c "from models import *; convert('cfg/yolov3-spp.cfg', 'weights/')"
    Success: converted 'weights/' to 'weights/yolov3-spp.weights'
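
For orientation, a Darknet `.weights` file is a small binary header followed by raw 32-bit floats. The sketch below reads only that header. The layout used here (three little-endian 32-bit integers for the version, then an images-seen counter that is 64-bit for format version 0.2 and later and 32-bit before that) reflects the commonly described Darknet format and should be treated as an assumption. `read_darknet_header` is a hypothetical name, not the repo's `convert` function.

```python
import struct


def read_darknet_header(path: str) -> dict:
    """Read the header of a Darknet .weights file (layout assumed, see text)."""
    with open(path, "rb") as f:
        major, minor, revision = struct.unpack("<3i", f.read(12))
        # Newer files store the images-seen counter as 64-bit, older ones as 32-bit.
        if major * 10 + minor >= 2 and major < 1000 and minor < 1000:
            (seen,) = struct.unpack("<Q", f.read(8))
            header_bytes = 20
        else:
            (seen,) = struct.unpack("<i", f.read(4))
            header_bytes = 16
    return {
        "version": (major, minor, revision),
        "seen": seen,
        "header_bytes": header_bytes,  # float32 weights would start at this offset
    }
```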
## mAP
<i></i> |Size |COCO mAP<br>@0.5...0.95 |COCO mAP<br>@0.5
--- | --- | --- | ---
YOLOv3-tiny<br>YOLOv3<br>YOLOv3-SPP<br>**YOLOv3-SPP-ultralytics** |320 |14.0<br>28.7<br>30.5<br>**37.7** |29.1<br>51.8<br>52.3<br>**56.8**
YOLOv3-tiny<br>YOLOv3<br>YOLOv3-SPP<br>**YOLOv3-SPP-ultralytics** |416 |16.0<br>31.2<br>33.9<br>**41.2** |33.0<br>55.4<br>56.9<br>**60.6**
YOLOv3-tiny<br>YOLOv3<br>YOLOv3-SPP<br>**YOLOv3-SPP-ultralytics** |512 |16.6<br>32.7<br>35.6<br>**42.6** |34.9<br>57.7<br>59.5<br>**62.4**
YOLOv3-tiny<br>YOLOv3<br>YOLOv3-SPP<br>**YOLOv3-SPP-ultralytics** |608 |16.6<br>33.1<br>37.0<br>**43.1** |35.4<br>58.2<br>60.7<br>**62.8**
- mAP@0.5 run at `--iou-thr 0.5`, mAP@0.5...0.95 run at `--iou-thr 0.7`
- Darknet results:
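
Both thresholds above are intersection-over-union (IoU) values. As a reminder of what is being thresholded, here is a minimal IoU sketch for two axis-aligned boxes in `(x1, y1, x2, y2)` format. The function name `box_iou` is illustrative and not taken from the repo.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2."""
    # Width and height of the overlap; clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Two 2x2 boxes offset by one pixel in each direction overlap in a 1x1 square, giving an IoU of 1/7, well below either threshold.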
$ python3 --cfg yolov3-spp.cfg --weights --img 640 --augment
Namespace(augment=True, batch_size=16, cfg='cfg/yolov3-spp.cfg', conf_thres=0.001, data='', device='', img_size=640, iou_thres=0.6, save_json=True, single_cls=False, task='test', weights='weight
Using CUDA device0 _CudaDeviceProperties(name='Tesla V100-SXM2-16GB', total_memory=16130MB)
Class Images Targets P R mAP@0.5 F1: 100%|█████████| 313/313 [03:00<00:00, 1.74it/s]
all 5e+03 3.51e+04 0.375 0.743 0.64 0.492
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.456
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.647
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.496
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.263
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.501
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.596
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.361
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.597
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.666
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.492
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.719
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.810
Speed: 17.5/2.3/19.9 ms inference/NMS/total per 640x640 image at batch-size 16
<!-- Speed: 11.4/2.2/13.6 ms inference/NMS/total per 608x608 image at batch-size 1 -->
## Reproduce Our Results
Run the commands below. Training each model takes about one week on a 2080Ti.

    $ python --data --weights '' --batch-size 16 --cfg yolov3-spp.cfg
    $ python --data --weights '' --batch-size 32 --cfg yolov3-tiny.cfg
## Reproduce Our Environment
To access an up-to-date working environment (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled), consider one of the following:
- **GCP** Deep Learning VM with $300 free credit offer. See our GCP Quickstart Guide.
- **Google Colab Notebook** with 12 hours of free GPU time.
- **Docker Image**. See our Docker Quickstart Guide.
## Citation
## About Us
Ultralytics is a U.S.-based particle physics and AI startup with over 6 years of expertise supporting government, academic and business clients. We offer a wide range of vision AI services, spanning from simple expert advice up to delivery of fully customized, end-to-end production solutions, including:
- **Cloud-based AI** surveillance systems operating on **hundreds of HD video streams in realtime.**
- **Edge AI** integrated into custom iOS and Android apps for realtime **30 FPS video inference.**
- **Custom data training**, hyperparameter evolution, and model exportation to any destination.
For business inquiries and professional support requests please visit us at
## Contact
**Issues should be raised directly in the repository.** For business inquiries or professional support requests please visit or email Glenn Jocher at