Real-Time Object Detection on Raspberry Pi Using YOLOv5 and OpenCV

Object detection is a fascinating area of computer vision that allows computers to identify and locate objects within images or video streams. This post will guide you through setting up real-time object detection on a Raspberry Pi using YOLOv5 and OpenCV. We’ll also handle warnings effectively and focus on detecting specific objects like persons, cars, motorcycles, buses, and trucks within a defined region of interest (ROI).

Prerequisites

  1. Raspberry Pi: Ensure you have a Raspberry Pi with internet access.
  2. Python: Python should be installed on your system.
  3. OpenCV: Install OpenCV using pip install opencv-python.
  4. Torch: Install Torch using pip install torch.
  5. YOLOv5: We’ll use the YOLOv5 model from Ultralytics.

Step-by-Step Guide

1. Set Up the Environment

First, ensure that you have the necessary Python packages installed:

pip install opencv-python torch
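
Before moving on, it can help to confirm that both packages are importable from the Python interpreter you plan to use. The snippet below is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical and is not part of OpenCV or PyTorch. Note that opencv-python is imported as cv2.

```python
import importlib.util


def missing_packages(module_names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # cv2 is the import name of opencv-python; torch is PyTorch.
    missing = missing_packages(["cv2", "torch"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If a module is reported as missing, rerun the pip command above with the same interpreter (for example, python3 -m pip install ...).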

2. Define the Object Detection Script

Here’s the Python script that captures video from an RTSP stream, runs detection on every 50th frame, and displays the results with a region of interest (ROI). It also includes steps to suppress unwanted warnings and reduce CPU load.

import logging
import cv2
import torch
import time
import os

# Configure logging at WARNING level and create a named logger for HEVC codec messages
logging.basicConfig(level=logging.WARNING, format='%(levelname)s - %(message)s')
logger = logging.getLogger('HEVC_Warnings')

# Load YOLOv5 model (small version for better performance on Raspberry Pi)
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

# Define the classes to detect (COCO class indices used by YOLOv5)
classes = [0, 2, 3, 5, 7]  # Indices for person, car, motorcycle, bus, and truck

# Access the RTSP stream
rtsp_url = "rtsp://<login>:<password>@<ip_address>:554/h264Preview_01_main"
cap = cv2.VideoCapture(rtsp_url)

# Define region of interest (ROI) coordinates (x, y, width, height)
roi_x, roi_y, roi_width, roi_height = 100, 100, 400, 300

def detect_and_signal(frame):
    results = model(frame)
    detections = results.xyxy[0]  # Extract detections

    vehicle_detected = False
    for *box, conf, cls in detections:
        if int(cls) in classes:
            vehicle_detected = True
            if int(cls) == 0:
                print("Person detected")
            elif int(cls) == 2:
                print("Car detected")
            elif int(cls) == 7:
                print("Truck detected")
            else:
                # Remaining classes in the list: motorcycle (3) and bus (5)
                print(f"Other vehicle detected: {model.names[int(cls)]}")

            # The ROI outline is drawn on the full frame in the main loop.
            # It is not repeated here because this function receives the cropped ROI,
            # where the frame-space ROI coordinates would not line up.

            # Draw rectangle for visualization around detected object
            cv2.rectangle(frame, (int(box[0]), int(box[1])), (int(box[2]), int(box[3])), (0, 255, 0), 2)
            # Add class label and confidence score
            label = f'{model.names[int(cls)]} {conf:.2f}'
            cv2.putText(frame, label, (int(box[0]), int(box[1]) - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Send signal if any vehicle is detected
    # GPIO.output(18, GPIO.HIGH if vehicle_detected else GPIO.LOW)

# Set skip factor (process every nth frame)
skip_factor = 50  # Process every 50th frame
frame_count = 0

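# Point FFmpeg's report file at /dev/null to try to quiet codec warnings.
# Note: this sets an environment variable and does not redirect stderr. FFREPORT is
# documented for the ffmpeg command-line tools, so it may have no effect inside OpenCV.
os.environ['FFREPORT'] = 'file=/dev/null'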

while True:
    ret, frame = cap.read()
    if not ret or frame is None:
        print("Failed to capture frame. Retrying...")
        time.sleep(0.1)  # Brief pause so a dropped stream does not spin the CPU
        continue

    # Skip frames
    if frame_count % skip_factor == 0:
        try:
            # Resize frame to reduce resolution
            frame_resized = cv2.resize(frame, (640, 480))
            # Draw square around the ROI
            cv2.rectangle(frame_resized, (roi_x, roi_y), (roi_x + roi_width, roi_y + roi_height), (255, 0, 0), 2)

            # Extract region of interest (ROI) from the frame
            roi = frame_resized[roi_y:roi_y + roi_height, roi_x:roi_x + roi_width]
            # Resize ROI to a fixed size before detection
            roi_resized = cv2.resize(roi, (640, 480))
        except cv2.error as e:
            print(f"Error processing ROI: {e}")
            frame_count += 1
            continue

        # Run detection on the ROI and time it
        start_time = time.time()
        detect_and_signal(roi_resized)
        end_time = time.time()
        print(f"Frame processed in {end_time - start_time:.2f} seconds")

        # Display the full frame with the ROI outline, and the ROI with detections (for debugging)
        cv2.imshow('Frame', frame_resized)
        cv2.imshow('ROI', roi_resized)

    frame_count += 1

    # Press 'q' to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the stream and close the preview windows
cap.release()
cv2.destroyAllWindows()
# GPIO.cleanup()
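
One detail worth keeping in mind: detection runs on the cropped ROI after it has been resized, so the box coordinates returned by the model refer to that resized image and not to the full frame. If you want to draw the boxes on the full frame instead, they need to be scaled and shifted back. The sketch below shows one way to do that in plain Python; the function name roi_box_to_frame and its parameters are hypothetical and only illustrate the arithmetic.

```python
def roi_box_to_frame(box, roi_origin, roi_size, resized_size):
    """Map (x1, y1, x2, y2) from the resized ROI image back to frame coordinates."""
    x1, y1, x2, y2 = box
    roi_x, roi_y = roi_origin          # top-left corner of the ROI in the frame
    roi_w, roi_h = roi_size            # ROI size in the frame
    res_w, res_h = resized_size        # size the ROI was resized to before detection

    # Scale factors that undo the resize of the ROI
    sx = roi_w / res_w
    sy = roi_h / res_h

    return (
        int(round(roi_x + x1 * sx)),
        int(round(roi_y + y1 * sy)),
        int(round(roi_x + x2 * sx)),
        int(round(roi_y + y2 * sy)),
    )


if __name__ == "__main__":
    # ROI and resize values taken from the script above
    print(roi_box_to_frame((320, 240, 640, 480), (100, 100), (400, 300), (640, 480)))
```

With the values from the script, a box covering the lower-right quarter of the resized ROI maps to (300, 250, 500, 400) in the 640×480 frame.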

(Optional) Further Optimization

If you want to optimize the script further, you can reduce the image resolution, e.g. to 192×192 pixels, and shrink the ROI to match:

roi_x, roi_y, roi_width, roi_height = 50, 50, 100, 100
frame_resized = cv2.resize(frame, (192, 192))
roi_resized = cv2.resize(roi, (192, 192))
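
When you shrink the frame, the ROI has to shrink with it. The original ROI of 100, 100, 400, 300 would extend past a 192×192 frame, and array slicing would silently cut it short. The following is a small sketch of a bounds check you could run before cropping; clamp_roi is a hypothetical helper, not an OpenCV function.

```python
def clamp_roi(roi, frame_size):
    """Clamp an (x, y, width, height) ROI so it lies inside a (width, height) frame."""
    x, y, w, h = roi
    frame_w, frame_h = frame_size

    # Keep the top-left corner inside the frame
    x = max(0, min(x, frame_w - 1))
    y = max(0, min(y, frame_h - 1))

    # Limit width and height to what remains to the right of and below the corner
    w = max(1, min(w, frame_w - x))
    h = max(1, min(h, frame_h - y))
    return x, y, w, h


if __name__ == "__main__":
    print(clamp_roi((50, 50, 100, 100), (192, 192)))    # already fits
    print(clamp_roi((100, 100, 400, 300), (192, 192)))  # gets cut down to fit
```

If the clamped ROI differs from the one you configured, that is a hint that the ROI values still belong to a larger frame size.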

Explanation

  1. Model Loading: We load the YOLOv5 model using torch.hub.load. The script then keeps only detections of persons, cars, motorcycles, buses, and trucks.
  2. RTSP Stream Access: The video stream is accessed using OpenCV’s cv2.VideoCapture.
  3. Region of Interest (ROI): We define a specific region in the frame where we want to detect objects.
  4. Frame Skipping: To reduce CPU load, we process every 50th frame.
  5. Warning Suppression: We set the FFREPORT environment variable to file=/dev/null to try to keep HEVC codec warnings out of the console.
  6. Object Detection and Annotation: Detected objects within the ROI are annotated and displayed.
  7. Performance Optimization: The frame is resized for better performance before processing.
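
As a variation on the frame skipping step, cv2.VideoCapture also offers grab() and retrieve(), which together do what read() does in one call. Calling grab() for every frame and retrieve() only for the frames you want to process is a common pattern; how much work it actually saves depends on the capture backend, so treat this as a sketch to experiment with. The generator name read_every_nth is hypothetical, and it accepts any object with grab() and retrieve() methods, so no camera is needed to try it out.

```python
def read_every_nth(cap, n):
    """Yield (index, frame) for every n-th frame of a capture-like object.

    `cap` is assumed to behave like cv2.VideoCapture: grab() returns a bool,
    retrieve() returns a (success, frame) pair.
    """
    index = 0
    while True:
        # Advance the stream by one frame; stop when no frame is available
        if not cap.grab():
            break
        if index % n == 0:
            # Only fetch the image data for frames we intend to process
            ok, frame = cap.retrieve()
            if ok:
                yield index, frame
        index += 1
```

In the script above, this could replace the frame_count bookkeeping, for example with: for index, frame in read_every_nth(cap, skip_factor).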

Conclusion

This setup allows you to run real-time object detection on a Raspberry Pi with optimized CPU usage. The code processes every 50th frame to reduce load and focuses on a defined ROI for targeted detection. By following these steps, you can effectively implement and optimize object detection for various applications.
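
To get a feel for what processing every 50th frame means in practice, you can work out the time between two detections. The frame rate of 25 fps used below is only an assumption for illustration; check what your camera actually delivers. The function name detection_interval_seconds is hypothetical.

```python
def detection_interval_seconds(skip_factor, stream_fps):
    """Return the time in seconds between two processed frames."""
    if skip_factor <= 0 or stream_fps <= 0:
        raise ValueError("skip_factor and stream_fps must be positive")
    return skip_factor / stream_fps


if __name__ == "__main__":
    # Assumed stream rate of 25 fps with the skip factor from the script
    print(detection_interval_seconds(50, 25))
```

Under that assumption, the script runs one detection every 2 seconds, so a lower skip factor is worth trying if objects pass through the ROI faster than that.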

Feel free to modify the code to suit your specific needs, such as adjusting the ROI, skip factor, or detected object classes. Happy coding!
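
If you change the list of detected classes, a lookup table is easier to maintain than a chain of if/elif branches. Here is a minimal sketch in plain Python. The dictionary holds a small subset of the COCO class indices used by the pretrained model; inside the script itself, model.names provides the full mapping. The names COCO_NAMES and describe_detections are hypothetical.

```python
# Subset of the COCO class indices used by the pretrained YOLOv5 models
COCO_NAMES = {
    0: "person",
    1: "bicycle",
    2: "car",
    3: "motorcycle",
    5: "bus",
    7: "truck",
}


def describe_detections(class_ids, wanted):
    """Return one message per detected class ID that is in `wanted`."""
    messages = []
    for class_id in class_ids:
        if class_id in wanted:
            name = COCO_NAMES.get(class_id, f"class {class_id}")
            messages.append(f"{name} detected")
    return messages


if __name__ == "__main__":
    # Class 16 is not in the wanted set, so it is ignored
    for message in describe_detections([0, 2, 16, 7], {0, 2, 3, 5, 7}):
        print(message)
```

Adding bicycles, for instance, then only requires adding index 1 to the wanted set.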