06 Computer Vision

Indoor Scene Recognition

A deployable indoor scene recognition system using a fine-tuned YOLOv5s model (14.4MB), served via both a REST API (restapi.py) and an interactive web application (webapp.py) built with Flask/Jinja2. Fully containerized with Docker for one-command deployment.

YOLOv5 · PyTorch · Python · Flask · REST API · Jinja2 · Docker · HTML/CSS
Model Size: 14.4MB
Interfaces: 2
Deployment: Docker
Type: Computer Vision
Status: Completed
Year: 2026
Role: Solo Engineer
01

System Architecture · 3D View

02

Architecture Diagram

Image Input: Upload · URL · Test
Request Router: Flask · Python
REST API: restapi.py
Web App: webapp.py
YOLOv5s Model: yolov5s.pt · 14.4MB
Inference Engine: PyTorch · CUDA
JSON Response: Predictions · Scores
Web UI: Jinja2 · Templates
03

Screenshots & Output

terminal
$ docker build -t indoor-rec . && docker run -p 5000:5000 indoor-rec
✓ Image built successfully
✓ YOLOv5s model loaded → yolov5s.pt (14.4MB)
POST /predict HTTP/1.1
→ Preprocessing image: 640x640
→ Detected: bedroom 0.91, furniture 0.87, window 0.74
→ Response: 200 OK · 0.23s
Web UI available at http://localhost:5000
API Terminal Output
Flask server inference log
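
The log line "Preprocessing image: 640x640" suggests inputs are fitted to a square before inference. Below is a minimal sketch of the scale-then-pad ("letterbox") geometry commonly used with YOLOv5, assuming that is what the preprocessing step does. `letterbox_geometry` is a hypothetical helper for illustration, not a function from the repository.

```python
def letterbox_geometry(width, height, target=640):
    """Return (new_w, new_h, pad_x, pad_y) for fitting an image into a square.

    The image is scaled by a single factor so that it fits inside a
    target x target canvas, then padded equally on both sides of the
    shorter dimension. Assumption: the project preprocesses this way.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    scale = min(target / width, target / height)
    new_w = round(width * scale)
    new_h = round(height * scale)
    pad_x = (target - new_w) / 2  # padding on the left and on the right
    pad_y = (target - new_h) / 2  # padding on the top and on the bottom
    return new_w, new_h, pad_x, pad_y


# A 1280x720 frame is halved to 640x360, then padded 140 px top and bottom.
print(letterbox_geometry(1280, 720))  # (640, 360, 0.0, 140.0)
```

Keeping one scale factor for both axes preserves the aspect ratio, so objects are not stretched before they reach the model.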
Detection Scores
Bedroom: 91%
Kitchen: 87%
Bathroom: 83%
Living Room: 78%
Office: 74%
Per-class confidence scores
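
A short sketch of how per-class scores like these might be filtered and formatted for display. The function names and the 0.25 default threshold are assumptions made for illustration; the sample values are copied from the Detection Scores card.

```python
def rank_scores(scores, threshold=0.25):
    """Keep classes at or above `threshold`, highest confidence first."""
    kept = [(name, conf) for name, conf in scores.items() if conf >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


def as_percent(conf):
    """Format a 0-1 confidence as a whole-number percentage string."""
    return f"{round(conf * 100)}%"


# Sample values from the Detection Scores card (order shuffled on purpose).
scores = {"Office": 0.74, "Bedroom": 0.91, "Bathroom": 0.83,
          "Kitchen": 0.87, "Living Room": 0.78}

# With a stricter 0.80 cut-off only the three strongest classes remain.
for name, conf in rank_scores(scores, threshold=0.80):
    print(f"{name}: {as_percent(conf)}")
```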
Data Output
{
  "class": "bedroom",
  "confidence": 0.91,
  "inference_time_ms": 230,
  "model": "yolov5s.pt",
  "image_size": "640x640"
}
API Response JSON
REST endpoint output
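
One way a payload with these fields could be assembled is sketched below. It assumes the endpoint reports the single highest-confidence detection as `class`; `build_response` and its parameters are hypothetical names, and the detections are the values from the sample log.

```python
import json
import time


def build_response(detections, started_at, model="yolov5s.pt", image_size="640x640"):
    """Build a response dict from (class, confidence) pairs.

    `started_at` is a time.perf_counter() reading taken before inference.
    Assumption: the top-scoring detection is the one reported.
    """
    if not detections:
        raise ValueError("no detections to report")
    label, confidence = max(detections, key=lambda d: d[1])
    elapsed_ms = round((time.perf_counter() - started_at) * 1000)
    return {
        "class": label,
        "confidence": round(confidence, 2),
        "inference_time_ms": elapsed_ms,
        "model": model,
        "image_size": image_size,
    }


start = time.perf_counter()
detections = [("bedroom", 0.91), ("furniture", 0.87), ("window", 0.74)]
payload = build_response(detections, start)
print(json.dumps(payload, indent=2))
```

Serializing with `json.dumps` guarantees quoted keys and string values, so any JSON client can parse the output.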
Project Structure
📁 indoor-recognition-main/
├─ restapi.py         REST API server
├─ webapp.py          Web app server
├─ yolov5s.pt         14.4MB model
├─ Dockerfile         Container config
├─ templates/         Jinja2 HTML
├─ tests/             pytest suite
└─ requirements.txt   Dependencies
Dockerized repo layout
04

What I Built

Fine-tuned YOLOv5s model (yolov5s.pt, 14.4MB) for indoor scene recognition with custom class detection.

Built dual-interface serving layer: REST API (restapi.py) for programmatic access and interactive web app (webapp.py) for browser-based inference.

Designed Jinja2-powered frontend (templates/, static/) with real-time image upload and prediction display.

Containerized the full stack with Docker (Dockerfile), enabling one-command deployment on any host that runs Docker.

Organized a test suite (tests/, test-images/) covering model inference validation and API endpoint testing.

Documented architecture and usage in docs/ with requirements.txt for reproducible environment setup.

05

Project Insights

Personal Notes & Learnings

What I Built

A production-deployable indoor scene recognition system with dual serving interfaces: a REST API for programmatic use and a full web app for browser-based inference.

Key Design Decisions

  • YOLOv5s chosen for its accuracy/size tradeoff: the 14.4MB weights file adds little to the container image
  • Dual interface: restapi.py for developers, webapp.py for end users; same model, two access patterns
  • Docker-first: with the Dockerfile at the repo root, docker build -t indoor-rec . && docker run -p 5000:5000 indoor-rec is the full deployment story
  • Jinja2 templates give the web app a proper UI without a frontend framework
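
The "same model, two access patterns" idea can be sketched without any web framework: one prediction callable feeds both a JSON view and a template-context view. Everything below is illustrative. `stub_predictor`, `api_view`, and `web_view` are hypothetical names, and in the project the equivalent logic would presumably sit inside the Flask handlers in restapi.py and webapp.py.

```python
import json
from typing import Callable, List, Tuple

Detection = Tuple[str, float]
Predictor = Callable[[bytes], List[Detection]]


def stub_predictor(image_bytes: bytes) -> List[Detection]:
    """Stand-in for the real model call; returns the sample-log values."""
    return [("bedroom", 0.91), ("furniture", 0.87), ("window", 0.74)]


def api_view(image_bytes: bytes, predict: Predictor) -> str:
    """What a REST handler might return: JSON for programmatic clients."""
    detections = predict(image_bytes)
    return json.dumps({"predictions": [
        {"class": name, "confidence": conf} for name, conf in detections
    ]})


def web_view(image_bytes: bytes, predict: Predictor) -> dict:
    """What a web handler might pass to a template: display-ready rows."""
    detections = predict(image_bytes)
    return {"rows": [(name.title(), f"{round(conf * 100)}%")
                     for name, conf in detections]}


# Both interfaces share one predictor; only the presentation differs.
print(api_view(b"", stub_predictor))
print(web_view(b"", stub_predictor))
```

Passing the predictor in as an argument keeps the model loaded once and lets tests swap in a stub like the one above.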

Testing

  • tests/ directory with test-images/ for inference validation
  • requirements.txt pins all dependencies for reproducible builds
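
A minimal sketch of what inference validation over test-images/ could look like. The helper names, the accepted file suffixes, and the shape of a prediction (a list of class/confidence pairs) are assumptions, not the repository's actual tests.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed formats


def find_test_images(folder):
    """Return image files directly under `folder`, sorted for a stable order."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(p for p in root.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)


def check_prediction(detections):
    """Sanity-check a list of (class, confidence) pairs."""
    assert detections, "expected at least one detection"
    for name, conf in detections:
        assert isinstance(name, str) and name, "class name must be a non-empty string"
        assert 0.0 <= conf <= 1.0, f"confidence out of range: {conf}"


def run_inference_checks(predict, folder="test-images"):
    """Run the sanity checks on every image in `folder`; return how many ran.

    `predict` is a hypothetical callable mapping an image path to detections.
    """
    images = find_test_images(folder)
    for image_path in images:
        check_prediction(predict(image_path))
    return len(images)
```

A pytest test would call `run_inference_checks` with the real model wrapper and assert that the returned count is greater than zero, so an empty or missing folder cannot pass silently.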

Next Steps

  • GPU inference support for faster prediction
  • Multi-image batch processing endpoint
  • Deploy to cloud with auto-scaling