02 · Computer Vision

Classroom Behavior Detection System

Production-grade classroom behavior detection system trained on the SCB-05 dataset using YOLOv8x on Google Colab Pro (A100 GPU). Achieves 74.85% mAP@0.5 overall across 11+ behavioral classes, with 93.5% AP on the sleeping class. Includes a tiled inference pipeline for live CCTV/RTSP feeds.

YOLOv8x · PyTorch · Google Colab · Roboflow · Grad-CAM · OpenCV · Python · YAML
View on GitHub · Colab Notebook
mAP@0.5: 74.85%
Sleep Detection: 93.5%
Behavior Classes: 11+
Type: Computer Vision
Status: Active Research
Year: 2025
Role: Research Assistant
01

System Architecture · 3D View

02

Architecture Diagram

SCB-05 Dataset: Roboflow Annotated
Augmentation: Resize · Flip · Mosaic
YOLOv8x: A100 · Colab Pro
Training Loop: YAML Config · GPU
Evaluation: mAP@0.5 = 74.85%
Grad-CAM: Explainability
Tiled Inference: CCTV · RTSP Feed
Detection Output: Bounding Boxes · Live
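
The tiled inference stage cuts each wide CCTV frame into overlapping crops so that small, distant students still occupy enough pixels for the detector. The project's tiled_inference.py is not shown here, so the following is only a minimal sketch of how the crop grid could be planned. The function names, the 640-pixel tile size, and the 20% overlap are assumptions for illustration.

```python
def tile_origins(length, tile, overlap):
    """Start offsets of tiles covering [0, length) along one axis."""
    if tile >= length:
        return [0]
    # Step between neighbouring tiles; overlap is a fraction of the tile size.
    stride = max(1, int(tile * (1.0 - overlap)))
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the far edge
    return starts


def plan_tiles(width, height, tile=640, overlap=0.2):
    """Return (x1, y1, x2, y2) crops that together cover the whole frame."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_origins(height, tile, overlap)
        for x in tile_origins(width, tile, overlap)
    ]


# Hypothetical 1080p classroom frame.
tiles = plan_tiles(1920, 1080)
for t in tiles:
    print(t)
```

Each crop would be passed to the detector separately. Anchoring the last tile to the far edge keeps every crop the same size, at the cost of a slightly larger overlap in the final row and column.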
03

Training Results & Proof

mAP@0.5: 74.85% (best epoch 24)
Sleeping AP: 94.8% (highest class)
Phone AP: 88.8% (Using Phone class)
Epochs: 150 (A100 · Colab Pro)
[Figure: Training & Validation Curves · 150 epochs]
Per-Class AP
Sleeping: 94.8%
Using Phone: 88.8%
Reading: 68.8%
Hand Raising: 66.3%
Writing: 49.5%
Overall mAP@0.5: 74.85% (best at epoch 24 / 150)
[Figure: Precision-Recall Curve]
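
Each per-class AP figure is the area under that class's precision-recall curve, and mAP@0.5 is the mean of those areas over all classes. The sketch below shows one common way to compute AP from scored detections, using all-point interpolation. It is not taken from the project, and the Ultralytics evaluator may interpolate differently. The function name and the toy inputs are assumptions.

```python
def average_precision(scores, is_tp, num_gt):
    """Area under the precision-recall curve (all-point interpolation).

    scores: confidence of each detection
    is_tp:  True if that detection matched a ground-truth box at IoU >= 0.5
    num_gt: number of ground-truth boxes for this class
    """
    if num_gt == 0 or not scores:
        return 0.0
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_tp[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / (tp + fp))
    # Replace each precision with the best precision at any higher recall.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum the rectangles under the curve wherever recall increases.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Toy example: four detections, four ground-truth boxes, one missed.
ap = average_precision([0.9, 0.8, 0.7, 0.6], [True, False, True, True], num_gt=4)
print(round(ap, 4))  # 0.625
```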
04

Screenshots & Output

terminal
$ yolo detect train data=scb05.yaml model=yolov8x.pt epochs=50
Device: A100 SXM4 80GB · Google Colab Pro
Epoch 1/50: box_loss=4.12 cls_loss=3.87
Epoch 25/50: mAP50=0.61 Precision=0.68
Epoch 40/50: mAP50=0.72 Precision=0.78
Epoch 50/50: mAP50=0.7485 Precision=0.812
✓ Best model saved → runs/detect/train/weights/best.pt
✓ Sleeping class AP: 0.935 — highest per-class score
$ python tiled_inference.py --source classroom.mp4
✓ Tiled inference complete → output_tiled.mp4
Training Output
YOLOv8x Colab A100 training logs
Class mAP Scores
Overall mAP@0.5: 75%
Sleeping: 94%
Precision: 81%
Recall: 74%
Using Phone: 68%
Class mAP Scores
Per-class detection performance
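
Precision and recall at the 0.5 threshold depend on how predictions are matched to ground truth: a prediction counts as a true positive only if its IoU with a not-yet-matched ground-truth box is at least 0.5. A minimal single-class sketch of that matching follows. It assumes boxes in (x1, y1, x2, y2) format, and the function names and toy boxes are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds, gts, iou_thr=0.5):
    """preds: list of (score, box); gts: list of boxes, all of one class."""
    matched = set()
    tp = 0
    # Greedy matching, highest-confidence predictions first.
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_j, best_iou = -1, iou_thr
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(gts) if gts else 0.0
    return precision, recall


gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [
    (0.9, (1, 1, 10, 10)),     # IoU 0.81 with the first box: true positive
    (0.8, (50, 50, 60, 60)),   # overlaps nothing: false positive
    (0.6, (20, 20, 29, 31)),   # IoU about 0.83 with the second box: true positive
]
precision, recall = precision_recall(preds, gts)
print(precision, recall)
```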
YAML Config
# scb05.yaml
path: ./SCB-05
nc: 11
names: [sleeping, writing, phone, reading...]
# model=yolov8x.pt epochs=50 best_mAP50=0.7485
Training Config
YAML training configuration
Project Structure
📁 classroom-behavior-detection/
├─ YOLOv8x_SCB05.ipynb     Colab notebook
├─ tiled_inference.py      CCTV pipeline
├─ gradcam_viz.py          Explainability
├─ SCB-05/                 Roboflow export
├─ scb05.yaml              Dataset config
└─ runs/detect/weights/    best.pt
Dataset Structure
SCB-05 Roboflow directory layout
05

What I Built

Trained YOLOv8x on the SCB-05 dataset on Google Colab Pro with an A100 GPU, achieving 74.85% mAP@0.5 across 11+ behavioral classes.

Achieved 93.5% AP on the sleeping class, the highest per-class score in the model.

Built a tiled inference pipeline for live CCTV/RTSP feed processing, enabling real-time classroom monitoring.

Curated and annotated a multi-class dataset via Roboflow, with rigorous quality control across hundreds of images.

Configured reproducible YAML-based training pipelines with systematic hyperparameter optimization (batch size, image size, epochs).

Integrated Grad-CAM explainability to validate that the model focuses on posture and body position rather than facial features.

Designed architecture for deployment on resource-constrained campus hardware with streaming video input.
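
With tiled inference, a student who falls inside the overlap between two tiles is detected twice, so tile-local boxes have to be shifted back into full-frame coordinates and then de-duplicated. The sketch below uses greedy per-class non-maximum suppression for that step. This is an assumption about how such a merge could work, not the project's actual code, and the function names, class ids, and sample boxes are hypothetical.

```python
def to_frame_coords(box, tile_origin):
    """Shift a tile-local (x1, y1, x2, y2) box by the tile's top-left corner."""
    ox, oy = tile_origin
    return (box[0] + ox, box[1] + oy, box[2] + ox, box[3] + oy)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def merge_tile_detections(per_tile, iou_thr=0.5):
    """per_tile: list of (tile_origin, [(score, class_id, local_box), ...])."""
    dets = [
        (score, cls, to_frame_coords(box, origin))
        for origin, tile_dets in per_tile
        for score, cls, box in tile_dets
    ]
    dets.sort(key=lambda d: d[0], reverse=True)
    kept = []
    for score, cls, box in dets:
        # Keep a box unless a higher-scoring box of the same class overlaps it.
        if all(cls != k_cls or iou(box, k_box) < iou_thr for _, k_cls, k_box in kept):
            kept.append((score, cls, box))
    return kept


# The same student appears in two overlapping tiles (class id 0 is hypothetical).
per_tile = [
    ((0, 0), [(0.80, 0, (540, 100, 620, 220))]),
    ((512, 0), [(0.90, 0, (28, 100, 108, 220)), (0.70, 1, (300, 50, 360, 150))]),
]
merged = merge_tile_detections(per_tile)
for det in merged:
    print(det)
```

Greedy suppression is the simplest choice. Objects cut in half by a tile border may need a more tolerant merge rule, which this sketch does not attempt.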

06

Project Insights

Personal Notes & Learnings

Research Context

This is my primary research assistantship project at Lawrence Technological University — a production-grade classroom behavior detection system trained on the SCB-05 dataset.

Key Results

  • 74.85% mAP@0.5 overall across 11+ behavioral classes
  • 93.5% AP on the sleeping class, the highest per-class score
  • Trained on Google Colab Pro (A100 GPU) for fast iteration

Technical Highlights

  • YOLOv8x (extra-large model) chosen for maximum accuracy over speed
  • Roboflow used for dataset curation, annotation, and augmentation pipeline
  • Tiled inference enables deployment on wide-angle CCTV footage without losing small-object detection
  • Grad-CAM heatmaps indicate the model attends to posture and body position rather than faces, which served as a responsible-AI check
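
The Grad-CAM heatmaps mentioned above come from a simple weighting: each feature-map channel is scaled by the spatial mean of its gradient with respect to a chosen target score, the channels are summed, and negative values are clipped. The project's gradcam_viz.py is not shown here, so the following is only a dependency-free sketch of that core formula. It assumes the activations and gradients have already been extracted from a convolutional layer, and the toy 2×2 values are invented for illustration.

```python
def grad_cam(activations, gradients):
    """Grad-CAM heatmap from [channels][rows][cols] nested lists.

    activations: feature maps A_k of the chosen layer
    gradients:   d(target score) / d(A_k), same shape as activations
    """
    rows, cols = len(activations[0]), len(activations[0][0])
    heat = [[0.0] * cols for _ in range(rows)]
    for act, grad in zip(activations, gradients):
        # Channel weight: global average of the gradient over all positions.
        weight = sum(sum(row) for row in grad) / (rows * cols)
        for i in range(rows):
            for j in range(cols):
                heat[i][j] += weight * act[i][j]
    # ReLU keeps only regions that raise the target score.
    heat = [[max(0.0, v) for v in row] for row in heat]
    peak = max(max(row) for row in heat)
    if peak > 0:
        heat = [[v / peak for v in row] for row in heat]  # scale to [0, 1]
    return heat


# Two toy channels: one supports the target score, one opposes it.
activations = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
gradients = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]
heatmap = grad_cam(activations, gradients)
print(heatmap)  # [[1.0, 0.0], [0.0, 0.0]]
```

For a detector, the target score has to be chosen explicitly, for example the confidence of one predicted box. The heatmap is then upsampled to the image size and overlaid on the frame.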

Next Steps

  • Streamlit analytics dashboard for live monitoring
  • Real-time RTSP stream integration for campus deployment
  • Paper submitted to IJSRST (under review)