Indoor Scene Recognition — YOLOv5 · REST API · Web App

Type: Computer Vision
Status: Completed
Year: 2026
Role: Solo Engineer

01 —

System Architecture · 3D View

02 —

Architecture Diagram

Image Input (Upload · URL · Test)

↓

Request Router (Flask · Python)

↓

REST API (restapi.py) | Web App (webapp.py)

↓

YOLOv5s Model (yolov5s.pt · 14.4MB)

↓

Inference Engine (PyTorch · CUDA)

↓

JSON Response (Predictions · Scores) | Web UI (Jinja2 · Templates)
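The flow in the diagram, one model call feeding two presentation layers, can be sketched in plain Python. This is a minimal illustration and not the project's code: run_model is a hypothetical stub standing in for YOLOv5s inference, string.Template stands in for Jinja2 so the sketch has no dependencies, and the function names and the "predictions" key are assumptions.

```python
import json
from string import Template


def run_model(image_bytes: bytes) -> list[dict]:
    """Hypothetical stub; the real project would run yolov5s.pt here."""
    return [{"class": "bedroom", "confidence": 0.91}]


def api_response(image_bytes: bytes) -> str:
    """REST-style path (restapi.py in the diagram): predictions as JSON."""
    return json.dumps({"predictions": run_model(image_bytes)})


# Web-style path (webapp.py in the diagram): the same predictions as HTML.
# string.Template is only a dependency-free stand-in for a Jinja2 template.
ROW = Template("<li>$label: $score%</li>")


def web_response(image_bytes: bytes) -> str:
    rows = "".join(
        ROW.substitute(label=p["class"], score=round(p["confidence"] * 100))
        for p in run_model(image_bytes)
    )
    return f"<ul>{rows}</ul>"
```

Keeping inference in one function means both interfaces return the same predictions for the same image; only the serialization differs.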

03 —

Screenshots & Output

terminal

$ docker build -t indoor-rec . && docker run -p 5000:5000 indoor-rec

✓ Image built successfully

✓ YOLOv5s model loaded → yolov5s.pt (14.4MB)

POST /predict HTTP/1.1

→ Preprocessing image: 640x640

→ Detected: bedroom 0.91, furniture 0.87, window 0.74

→ Response: 200 OK · 0.23s

Web UI available at http://localhost:5000

API Terminal Output

Flask server inference log
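The log reports preprocessing to 640x640 but does not say how the image gets there. A common approach in YOLO-style pipelines is letterboxing: scale the image to fit while preserving aspect ratio, then pad the remainder. The sketch below computes only the geometry, under that assumption; the function name is hypothetical.

```python
def letterbox_geometry(width: int, height: int, target: int = 640):
    """Return (new_width, new_height, pad_x, pad_y) for an aspect-preserving fit.

    pad_x and pad_y are the padding added on EACH side, so they may be
    fractional when the leftover space is odd.
    """
    scale = min(target / width, target / height)
    new_w, new_h = round(width * scale), round(height * scale)
    pad_x = (target - new_w) / 2
    pad_y = (target - new_h) / 2
    return new_w, new_h, pad_x, pad_y


# A 1280x720 frame is halved to 640x360, then padded 140 px top and bottom.
print(letterbox_geometry(1280, 720))
```

If the project stretches images to 640x640 instead, the padding terms drop out and each axis gets its own scale factor.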

Detection Scores

Bedroom: 91%

Kitchen: 87%

Bathroom: 83%

Living Room: 78%

Office: 74%

Per-class confidence scores
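A panel like this can be produced from raw per-class confidences by sorting and formatting them as percentages. The sketch assumes the scores arrive as a label-to-float mapping; the function name and the optional threshold are illustrative and not taken from the project.

```python
def rank_scores(scores: dict[str, float], threshold: float = 0.0) -> list[tuple[str, str]]:
    """Sort classes by confidence (highest first) and format as percentages."""
    kept = [(label, score) for label, score in scores.items() if score >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [(label, f"{score:.0%}") for label, score in kept]


scores = {
    "office": 0.74,
    "bedroom": 0.91,
    "living room": 0.78,
    "kitchen": 0.87,
    "bathroom": 0.83,
}

for label, pct in rank_scores(scores):
    print(f"{label}: {pct}")
```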

Data Output

{
  "class": "bedroom",
  "confidence": 0.91,
  "inference_time_ms": 230,
  "model": "yolov5s.pt",
  "image_size": "640x640"
}

API Response JSON

REST endpoint output
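A client consuming this response might parse it into a typed record and reject out-of-range values. The sketch assumes the endpoint emits standard JSON (quoted keys and string values) with exactly the fields shown above; the Prediction class and parse_prediction function are hypothetical names.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    label: str
    confidence: float
    inference_time_ms: int
    model: str
    image_size: tuple[int, int]


def parse_prediction(body: str) -> Prediction:
    """Parse one response body, validating the confidence range."""
    data = json.loads(body)
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    # "640x640" -> (640, 640)
    width, height = (int(n) for n in data["image_size"].split("x"))
    return Prediction(
        label=data["class"],
        confidence=confidence,
        inference_time_ms=int(data["inference_time_ms"]),
        model=data["model"],
        image_size=(width, height),
    )
```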

Project Structure

📁 indoor-recognition-main/
├─ restapi.py          REST API server
├─ webapp.py           Web app server
├─ yolov5s.pt          14.4MB model
├─ Dockerfile          Container config
├─ templates/          Jinja2 HTML
├─ tests/              pytest suite
└─ requirements.txt    Dependencies

Dockerized repo layout

04 —

What I Built

› Fine-tuned a YOLOv5s model (yolov5s.pt, 14.4MB) for indoor scene recognition with custom class detection.

› Built a dual-interface serving layer: a REST API (restapi.py) for programmatic access and an interactive web app (webapp.py) for browser-based inference.

› Designed a Jinja2-powered frontend (templates/, static/) with real-time image upload and prediction display.

› Containerized the full stack with Docker (Dockerfile), enabling one-command deployment on any host that runs Docker.

› Structured a test suite (tests/, test-images/) for model inference validation and API endpoint testing.

› Documented architecture and usage in docs/, with requirements.txt for reproducible environment setup.
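For the programmatic access path, a client needs to send an image to the /predict endpoint seen in the server log. The sketch below builds a multipart POST request with the standard library only. The form field name "image" is an assumption, since the source does not show what restapi.py expects, and build_predict_request is a hypothetical helper.

```python
import urllib.request
import uuid


def build_predict_request(
    image_bytes: bytes,
    filename: str,
    url: str = "http://localhost:5000/predict",
    field: str = "image",  # assumed field name; check restapi.py
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST carrying one image."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        image_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

With the container running, passing the returned request to urllib.request.urlopen would send it and return the server's response.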

05 —

Project Insights

✎ Personal Notes & Learnings

What I Built

A containerized indoor scene recognition system with two serving interfaces: a REST API for programmatic use and a full web app for browser-based inference.

Key Design Decisions

› YOLOv5s was chosen for its accuracy/size tradeoff: the 14.4MB weights file adds little to the container image.
› Dual interface: restapi.py for developers, webapp.py for end users. Same model, two access patterns.
› Docker-first: with the Dockerfile at the repo root, docker build -t indoor-rec . && docker run -p 5000:5000 indoor-rec is the full deployment story.
› Jinja2 templates give the web app a proper UI without a frontend framework.
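One way to realize "same model, two access patterns" is a small shared module that loads the weights once per process and hands the same object to every caller. The sketch below shows only the caching pattern. The loader body is a placeholder, and the torch.hub call mentioned in the comment is an assumption about how the weights might be loaded, not something the source confirms.

```python
from functools import lru_cache

WEIGHTS_PATH = "yolov5s.pt"


@lru_cache(maxsize=None)
def get_model():
    """Load the model on first use; later calls reuse the cached object.

    Placeholder loader: a real version might call something like
    torch.hub.load("ultralytics/yolov5", "custom", path=WEIGHTS_PATH).
    """
    return {"weights": WEIGHTS_PATH}


def predict_for_api(image_bytes: bytes):
    """Hypothetical REST-side entry point."""
    return get_model()


def predict_for_web(image_bytes: bytes):
    """Hypothetical web-side entry point."""
    return get_model()
```

If restapi.py and webapp.py run as separate processes, each would load its own copy once. The cache avoids reloading per request, not per process.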

Testing

› tests/ directory with test-images/ for inference validation
› requirements.txt pins all dependencies for reproducible builds
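An inference validation pass over test-images/ could look roughly like the following. It is a sketch under assumptions: fake_predict is a hypothetical stub standing in for the real model call, and the checks (non-empty class name, confidence within [0, 1]) are illustrative and not taken from the project's test suite.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def collect_test_images(folder) -> list[Path]:
    """Return image files in the folder, sorted for a stable order."""
    return sorted(p for p in Path(folder).iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)


def fake_predict(path: Path) -> dict:
    """Hypothetical stub; the real suite would run the model on the file."""
    return {"class": "bedroom", "confidence": 0.91}


def check_prediction(result: dict) -> None:
    assert isinstance(result["class"], str) and result["class"]
    assert 0.0 <= result["confidence"] <= 1.0


def validate_folder(folder="test-images") -> int:
    """Check every image in the folder; return how many were validated."""
    images = collect_test_images(folder)
    assert images, "no test images found"
    for path in images:
        check_prediction(fake_predict(path))
    return len(images)
```

A pytest test could then call validate_folder("test-images") and fail on the first prediction that breaks these invariants.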

Next Steps

› GPU inference support for faster prediction
› Multi-image batch processing endpoint
› Deploy to cloud with auto-scaling
