This tutorial covers both YOLOv8 OBB and standard YOLOv8. It provides code with step-by-step explanations, so you will need:
- A `best.pt` weights file from training a YOLOv8 OBB or YOLOv8 model
- An IDE (preferably Visual Studio Code)
- Python installed and available to the IDE
Install the required packages by running the following command in a terminal:

```
pip install torch torchvision ultralytics opencv-python numpy
```
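If you want to confirm that everything installed correctly, a quick version check in Python is enough (a minimal sketch; any recent versions should work):

```python
import cv2 as cv
import torch
import ultralytics

# Print the installed versions to confirm the imports resolve
print("OpenCV:", cv.__version__)
print("PyTorch:", torch.__version__)
print("Ultralytics:", ultralytics.__version__)
```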
YOLOv8 OBB Implementation
Import Libraries: In your Python script, start by importing the required libraries:
```python
import cv2 as cv
import torch
from ultralytics import YOLO
```
- `cv2`: Used for capturing video and processing images.
- `torch`: Required for running the YOLOv8 OBB model.
- `YOLO`: The class for loading and using the YOLOv8 OBB model.
Initialise Camera and Model: Set up the camera and load the YOLOv8 OBB or YOLOv8 model:
```python
camera_index = 1  # Use 0 for the default camera
model = YOLO(r"path/to/best.pt")  # Path to your model
```
- `camera_index`: Determines which camera to use (0 is usually the default).
- `model`: Loads the pre-trained YOLOv8 OBB model from the specified path.
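If `camera_index = 1` does not open on your machine, you can probe the first few indices to see which cameras respond; this is a hypothetical helper, not part of the tutorial's script:

```python
# Probe camera indices 0-2 and report which ones open successfully
for idx in range(3):
    test_cap = cv.VideoCapture(idx)
    print(f"Camera {idx}: {'available' if test_cap.isOpened() else 'not found'}")
    test_cap.release()
```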
Start Video Capture: Begin capturing video from the camera:
```python
cap = cv.VideoCapture(camera_index)
```
- `cap`: An object that represents the video capture. You read frames from this object.
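The ROI values used below assume a 1280x720 feed. You can ask OpenCV for that resolution up front, though drivers treat this as a request and may ignore it (a sketch):

```python
# Request a 1280x720 capture; the camera may silently fall back to another size
cap.set(cv.CAP_PROP_FRAME_WIDTH, 1280)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 720)
```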
Define the Region of Interest (ROI): Specify the coordinates and dimensions of the ROI:
```python
roi_x, roi_y, roi_w, roi_h = 515, 0, 250, 720  # x, y, width, height
```
- `roi_x` and `roi_y`: The top-left corner of the ROI.
- `roi_w` and `roi_h`: The width and height of the ROI.
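Since an ROI that extends past the frame would produce an empty or truncated crop, a quick sanity check against the actual capture size is worth adding; the clamping below is an optional assumption, not part of the original script:

```python
# Clamp the ROI to the reported capture size to avoid slicing past the frame edge
frame_w = int(cap.get(cv.CAP_PROP_FRAME_WIDTH))
frame_h = int(cap.get(cv.CAP_PROP_FRAME_HEIGHT))
roi_w = min(roi_w, frame_w - roi_x)
roi_h = min(roi_h, frame_h - roi_y)
```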
Main Loop for Processing Frames: Create a loop that continuously captures frames from the camera:
```python
while True:
    ret, frame = cap.read()
    if not ret:
        break
```
- `ret`: A boolean indicating whether the frame was successfully captured.
- `frame`: The captured frame from the camera.
Crop the ROI from the Frame: Extract the ROI from the captured frame:
```python
roi_frame = frame[roi_y:roi_y + roi_h, roi_x:roi_x + roi_w]  # Crop ROI
```
- This selects the portion of the frame that corresponds to the defined ROI.
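Note that NumPy slicing returns a view, not a copy, so anything drawn onto `roi_frame` would also appear on `frame`. If you ever need an independent crop, add `.copy()`:

```python
roi_frame = frame[roi_y:roi_y + roi_h, roi_x:roi_x + roi_w].copy()  # Independent copy of the ROI
```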
Run the YOLOv8 Model: Pass the cropped ROI to the YOLOv8 model for detection:
```python
results = model(roi_frame, verbose=False, conf=0.75)  # Run model
```
- `results`: Contains the detection results, including bounding boxes and class labels.
- `conf`: The confidence threshold for detections (0.75 means only detections with confidence above 75% are kept).
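If you want to inspect what the model returned before drawing anything, the `names`, `cls`, and `conf` fields on the result are enough for a quick printout (a sketch):

```python
r = results[0]
if r.obb is not None:
    # Print each detection's class name and confidence score
    for cls_id, score in zip(r.obb.cls, r.obb.conf):
        print(r.names[int(cls_id)], f"{float(score):.2f}")
```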
Annotate Results on the Original Frame: If objects are detected, draw bounding boxes and labels on the original frame:
```python
if results and results[0].obb:
    for obb in results[0].obb:
        vertices = obb.xyxyxyxy[0].cpu().numpy()
        vertices[:, 0] += roi_x
        vertices[:, 1] += roi_y
        points = vertices.astype(int)
```
- `obb`: The oriented bounding box for each detected object.
- `vertices`: The coordinates of the bounding-box vertices.
- The coordinates are shifted by `roi_x` and `roi_y` to map them from ROI space back onto the full frame.
Draw the Bounding Box and Labels: Visualise the results by drawing lines and text on the original frame:
```python
for j in range(len(points)):
    cv.line(frame, tuple(points[j]), tuple(points[(j + 1) % len(points)]), (0, 255, 0), 2)
label = results[0].names[int(obb.cls[0])]
cv.putText(frame, label, (points[0][0], points[0][1] - 10), cv.FONT_HERSHEY_SIMPLEX, 0.9, (255, 255, 255), 2)
```
- `cv.line`: Draws the bounding box edges around the detected object.
- `cv.putText`: Displays the class label near the detected object.
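As an aside, OpenCV's `cv.polylines` can draw the whole closed polygon in one call instead of the manual edge loop; a sketch of the equivalent (requires `numpy` imported as `np`):

```python
import numpy as np

# Draw the four OBB edges as a single closed polygon
pts = points.reshape((-1, 1, 2)).astype(np.int32)
cv.polylines(frame, [pts], isClosed=True, color=(0, 255, 0), thickness=2)
```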
Draw the ROI Rectangle: Visually outline the ROI on the frame:
```python
cv.rectangle(frame, (roi_x, roi_y), (roi_x + roi_w, roi_y + roi_h), (0, 255, 0), 2)  # Draw ROI
```
Display the Annotated Frame: Show the updated frame with annotations in a window:
cv.imshow("YOLOv8 OBB Detection", frame)
Exit on Key Press: Add functionality to exit the loop:
```python
if cv.waitKey(1) == ord('q'):
    break
```
Release Resources: At the end of your script, ensure that you release the video capture and close all windows:
```python
cap.release()
cv.destroyAllWindows()
```
Here is the full code for the YOLOv8 OBB detection within a ROI:
```python
import cv2 as cv
import torch
from ultralytics import YOLO

camera_index = 1
model = YOLO(r"path/to/best.pt")
cap = cv.VideoCapture(camera_index)

roi_x, roi_y, roi_w, roi_h = 515, 0, 250, 720

while True:
    ret, frame = cap.read()
    if not ret:
        break

    roi_frame = frame[roi_y:roi_y + roi_h, roi_x:roi_x + roi_w]
    results = model(roi_frame, verbose=False, conf=0.75)

    if results and results[0].obb:
        for obb in results[0].obb:
            vertices = obb.xyxyxyxy[0].cpu().numpy()
            vertices[:, 0] += roi_x
            vertices[:, 1] += roi_y
            points = vertices.astype(int)
            for j in range(len(points)):
                cv.line(frame, tuple(points[j]), tuple(points[(j + 1) % len(points)]), (0, 255, 0), 2)
            label = results[0].names[int(obb.cls[0])]
            cv.putText(frame, label, (points[0][0], points[0][1] - 10), cv.FONT_HERSHEY_SIMPLEX, 0.9, (255, 255, 255), 2)

    cv.rectangle(frame, (roi_x, roi_y), (roi_x + roi_w, roi_y + roi_h), (0, 255, 0), 2)
    cv.imshow("YOLOv8 OBB Detection", frame)

    if cv.waitKey(1) == ord('q'):
        break

cap.release()
cv.destroyAllWindows()
```
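If you want to gauge throughput, a simple frames-per-second overlay can be added to the loop; the lines below are hypothetical additions, shown in isolation:

```python
import time

prev = time.time()
# Inside the while loop, after annotating the frame:
now = time.time()
fps = 1.0 / max(now - prev, 1e-6)  # Guard against division by zero
prev = now
cv.putText(frame, f"{fps:.1f} FPS", (10, 30), cv.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
```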
YOLOv8 Implementation
Import Libraries: Start your Python script by importing the necessary libraries:
```python
import cv2 as cv
import torch
from ultralytics import YOLO
```
Initialise Camera and Model: Set up the camera and load the YOLOv8 model:
```python
camera_index = 1  # Use 0 for the default camera
model = YOLO(r"C:/path/to/best.pt")  # Path to your YOLOv8 model
```
- `camera_index`: Selects which camera to use.
- `model`: Loads the YOLOv8 model from the specified path.
Start Video Capture: Begin capturing video from the camera:
```python
cap = cv.VideoCapture(camera_index)
```
Define the Region of Interest (ROI): Specify the coordinates and dimensions for the ROI:
```python
roi_x, roi_y, roi_w, roi_h = 515, 0, 250, 720  # x, y, width, height
```
Main Loop for Processing Frames: Create a loop to continuously capture frames:
```python
while True:
    ret, frame = cap.read()
    if not ret:
        break
```
Crop the ROI from the Frame: Extract the ROI from the captured frame:
```python
roi_frame = frame[roi_y:roi_y + roi_h, roi_x:roi_x + roi_w]  # Crop ROI
```
Run the YOLOv8 Model: Pass the cropped ROI to the YOLOv8 model for detection:
```python
results = model(roi_frame, verbose=False, conf=0.75)  # Run model
```
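Ultralytics selects a device automatically, but you can force one with the documented `device` argument; a sketch, assuming a CUDA-capable GPU may or may not be present:

```python
device = "cuda" if torch.cuda.is_available() else "cpu"  # Fall back to CPU if no GPU
results = model(roi_frame, verbose=False, conf=0.75, device=device)
```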
Annotate Results on the Original Frame: If objects are detected, draw bounding boxes and labels on the original frame:
```python
if results and results[0].boxes:
    for box in results[0].boxes:  # Each box carries coordinates, class, and confidence
        x1, y1, x2, y2 = map(int, box.xyxy[0])  # Corners in ROI coordinates
        cv.rectangle(frame, (x1 + roi_x, y1 + roi_y), (x2 + roi_x, y2 + roi_y), (0, 255, 0), 2)
        label = results[0].names[int(box.cls[0])]  # Get class name
        cv.putText(frame, label, (x1 + roi_x, y1 + roi_y - 10), cv.FONT_HERSHEY_SIMPLEX, 0.9, (255, 255, 255), 2)
```

Note that the loop iterates over `results[0].boxes` rather than the raw `boxes.xyxy` tensor; a coordinate-only iteration would have no `.cls` attribute to read the class from. The `roi_x`/`roi_y` offsets map the ROI coordinates back onto the full frame.
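Alternatively, Ultralytics can render the annotations for you: `results[0].plot()` returns a BGR image the same size as the input, so pasting it back into the frame replaces the manual loop (a sketch):

```python
if results:
    annotated_roi = results[0].plot()  # ROI-sized BGR image with boxes and labels drawn
    frame[roi_y:roi_y + roi_h, roi_x:roi_x + roi_w] = annotated_roi
```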
Draw the ROI Rectangle: Outline the ROI on the original frame:
```python
cv.rectangle(frame, (roi_x, roi_y), (roi_x + roi_w, roi_y + roi_h), (0, 255, 0), 2)  # Draw ROI
```
Display the Annotated Frame: Show the updated frame with annotations:
cv.imshow("YOLOv8 Detection", frame)
Exit on Key Press: Add functionality to exit the loop:
```python
if cv.waitKey(1) == ord('q'):
    break
```
Release Resources: At the end of your script, release the video capture and close all windows:
```python
cap.release()
cv.destroyAllWindows()
```
Here is the full code for the YOLOv8 detection within a ROI:
```python
import cv2 as cv
import torch
from ultralytics import YOLO

camera_index = 1
model = YOLO(r"C:/path/to/best.pt")
cap = cv.VideoCapture(camera_index)

roi_x, roi_y, roi_w, roi_h = 515, 0, 250, 720

while True:
    ret, frame = cap.read()
    if not ret:
        break

    roi_frame = frame[roi_y:roi_y + roi_h, roi_x:roi_x + roi_w]
    results = model(roi_frame, verbose=False, conf=0.75)

    if results and results[0].boxes:
        for box in results[0].boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            cv.rectangle(frame, (x1 + roi_x, y1 + roi_y), (x2 + roi_x, y2 + roi_y), (0, 255, 0), 2)
            label = results[0].names[int(box.cls[0])]
            cv.putText(frame, label, (x1 + roi_x, y1 + roi_y - 10), cv.FONT_HERSHEY_SIMPLEX, 0.9, (255, 255, 255), 2)

    cv.rectangle(frame, (roi_x, roi_y), (roi_x + roi_w, roi_y + roi_h), (0, 255, 0), 2)
    cv.imshow("YOLOv8 Detection", frame)

    if cv.waitKey(1) == ord('q'):
        break

cap.release()
cv.destroyAllWindows()
```
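To save the annotated stream rather than only displaying it, OpenCV's `VideoWriter` can wrap the loop; the filename, codec, and 30 FPS rate below are placeholder choices:

```python
w = int(cap.get(cv.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv.CAP_PROP_FRAME_HEIGHT))
writer = cv.VideoWriter("output.mp4", cv.VideoWriter_fourcc(*"mp4v"), 30.0, (w, h))

# Inside the loop, after annotating:
#     writer.write(frame)

# After the loop, alongside cap.release():
writer.release()
```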