
Facial Recognition Attendance System using OpenCV and CSV

Advanced

Real-time facial recognition with LBPH and CSV attendance logging

1) Project Overview

What it does:
This system collects face images of known people, trains an LBPH face recognizer on them, and runs a real-time webcam application that recognizes people and logs attendance to a CSV file with a timestamp. It marks each person at most once per day, which gives a simple audit trail.

Real-world use cases:
Classroom attendance, office entry logs, event check-ins, manufacturing floor worker tracking (non-security use), and demos for biometric systems.

Technical goals:

  • Use OpenCV to detect faces and build a dataset.
  • Train an LBPH face recognizer (fast, suitable for small-to-medium deployments) using OpenCV's face module (opencv-contrib).
  • Recognize faces in real time and log attendance into a CSV.
  • Keep the system simple, reproducible, and modular.
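
For intuition about what "LBPH" means, here is a simplified, pure-Python illustration of the basic 3x3 Local Binary Pattern operator and the histogram built from it. This is only a teaching sketch and not OpenCV's implementation: the bit order used here is an arbitrary convention, and the helper names lbp_code and lbp_histogram are made up for this example.

```python
from collections import Counter


def lbp_code(patch):
    """Return the 8-bit LBP code for the center pixel of a 3x3 patch."""
    center = patch[1][1]
    # Neighbors read clockwise from the top-left corner
    # (the order is a convention chosen for this sketch).
    neighbors = [
        patch[0][0], patch[0][1], patch[0][2],
        patch[1][2], patch[2][2], patch[2][1],
        patch[2][0], patch[1][0],
    ]
    code = 0
    for value in neighbors:
        # A neighbor at least as bright as the center contributes a 1 bit.
        code = (code << 1) | (1 if value >= center else 0)
    return code


def lbp_histogram(image):
    """Count LBP codes over every interior pixel of a 2D list of intensities."""
    counts = Counter()
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            counts[lbp_code(patch)] += 1
    return counts


if __name__ == "__main__":
    checker = [[9, 1, 9], [1, 5, 1], [9, 1, 9]]
    print(lbp_code(checker))  # 170, i.e. 0b10101010
```

A recognizer of this family compares histograms like these between a new face and the training faces, which is why the score it reports behaves like a distance (lower means more similar).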

2) Key Technologies & Libraries

  • Python 3.8+
  • opencv-contrib-python (includes cv2.face): face detection and LBPH recognizer
  • numpy: numeric arrays
  • pandas (optional, for nicer CSV handling)
  • Standard library: os, csv, time, datetime, pathlib, argparse, pickle, logging

Install dependencies:

pip install opencv-contrib-python numpy pandas

(Use opencv-contrib-python; the face module is part of the contrib package.)
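
A quick way to confirm that the contrib build is the one Python actually imports is sketched below. The helper names has_lbph and check_install are hypothetical; the only OpenCV name relied on is cv2.face.LBPHFaceRecognizer_create, which the scripts in this project also use.

```python
import importlib


def has_lbph(cv2_module) -> bool:
    """True if the given cv2 module exposes the LBPH factory function."""
    face = getattr(cv2_module, "face", None)
    return face is not None and hasattr(face, "LBPHFaceRecognizer_create")


def check_install() -> str:
    """Return a one-line status message about the local OpenCV install."""
    try:
        cv2 = importlib.import_module("cv2")
    except ImportError:
        return "OpenCV is not installed: pip install opencv-contrib-python"
    if has_lbph(cv2):
        return f"OK: OpenCV {cv2.__version__} with cv2.face available"
    return "cv2.face is missing: install opencv-contrib-python"


if __name__ == "__main__":
    print(check_install())
```

Running this before the other scripts saves a confusing AttributeError later on.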

3) Learning Outcomes

  • Face detection with Haar cascades in OpenCV.
  • Building a face dataset from webcam captures.
  • Training and using LBPH face recognizer for real-time inference.
  • Handling real-time video streams, threading considerations, and UI feedback.
  • Data engineering for logging and deduplication (CSV attendance with date constraints).
  • Practical tradeoffs: accuracy vs speed, dataset size, retraining strategy.

4) Step-by-Step Explanation

High-level steps:

  1. Project scaffold: create folders dataset/, models/, logs/.
  2. Collect faces with capture_images.py: capture multiple images per person via webcam.
  3. Train model with train_model.py: read dataset/, train the LBPH model, and save models/face_recognizer.yml and models/labels.pkl.
  4. Run attendance app with attendance.py: uses the webcam, detects faces, recognizes them, and logs to logs/attendance_YYYY-MM-DD.csv.
  5. Test & iterate: add more faces, retrain, tune thresholds.
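
Step 1 can be done by hand or with a few lines of pathlib. The sketch below assumes the three folder names from the list above; the scaffold function itself is a hypothetical helper and is not one of the project's three scripts.

```python
from pathlib import Path


def scaffold(root: Path = Path(".")) -> list:
    """Create dataset/, models/ and logs/ under root and return their paths."""
    created = []
    for name in ("dataset", "models", "logs"):
        folder = root / name
        # exist_ok=True makes the call safe to repeat on an existing project.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    for folder in scaffold():
        print(f"[INFO] Ready: {folder}")
```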

We'll provide three scripts: capture_images.py, train_model.py, and attendance.py.

5) Full Working and Verified Python Code

Save each script as a separate .py file in your project folder. You can create the subfolders dataset/, models/, and logs/ yourself; if you don't, the scripts create them as needed.

Script A: capture_images.py
Use this to collect images for each person.

# capture_images.py
"""
Capture face images for a person.

Usage:
    python capture_images.py --name "John_Doe" --num 50

This will create dataset/John_Doe/ and save captured face crops there.
"""
import argparse
from pathlib import Path

import cv2


def ensure_dir(path: Path):
    path.mkdir(parents=True, exist_ok=True)


def capture(name: str, num_samples: int = 50, camera_index: int = 0):
    dataset_dir = Path("dataset") / name
    ensure_dir(dataset_dir)

    # Haar cascade for face detection (bundled with OpenCV)
    cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    face_cascade = cv2.CascadeClassifier(cascade_path)
    if face_cascade.empty():
        raise RuntimeError("Failed to load Haar cascade for face detection.")

    cap = cv2.VideoCapture(camera_index)
    if not cap.isOpened():
        raise RuntimeError("Could not open webcam (index {}).".format(camera_index))

    count = 0
    print(f"[INFO] Starting capture for '{name}'. Press 'q' to quit early.")
    while True:
        ret, frame = cap.read()
        if not ret:
            continue  # skip frames the camera failed to deliver

        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = face_cascade.detectMultiScale(
            gray, scaleFactor=1.2, minNeighbors=5, minSize=(60, 60)
        )
        for (x, y, w, h) in faces:
            if count >= num_samples:
                break  # several faces in one frame must not overshoot the target
            face = gray[y:y+h, x:x+w]
            face_resized = cv2.resize(face, (200, 200))
            count += 1
            file_path = dataset_dir / f"{name}_{count:03d}.png"
            cv2.imwrite(str(file_path), face_resized)
            # rectangle for visual feedback
            cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)

        # progress text for feedback
        cv2.putText(frame, f"Saved: {count}/{num_samples}", (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
        cv2.imshow("Capture - Press q to stop", frame)

        key = cv2.waitKey(1) & 0xFF
        if key == ord('q') or count >= num_samples:
            break

    cap.release()
    cv2.destroyAllWindows()
    print(f"[INFO] Done. {count} face images saved to {dataset_dir}")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Capture face images for a person")
    parser.add_argument("--name", required=True,
                        help="Person name (no spaces). Will be used as folder name.")
    parser.add_argument("--num", type=int, default=50,
                        help="Number of face samples to capture.")
    parser.add_argument("--camera", type=int, default=0,
                        help="Webcam index (default 0).")
    args = parser.parse_args()
    capture(args.name, args.num, args.camera)

How to use

  1. Run: python capture_images.py --name "John_Doe" --num 50
  2. Repeat for each person (use unique names, no spaces).
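
Because the name doubles as a folder name and as the label shown on screen, it helps to normalize it before capturing. The following is a minimal sketch of one way to do that; safe_person_name is a hypothetical helper, not part of capture_images.py, and it deliberately keeps only ASCII letters, digits, underscores, and hyphens.

```python
import re


def safe_person_name(raw: str) -> str:
    """Turn free-form input into a folder-safe label such as 'John_Doe'."""
    # Collapse any run of whitespace into a single underscore.
    name = re.sub(r"\s+", "_", raw.strip())
    # Drop everything that is not an ASCII letter, digit, underscore or hyphen.
    name = re.sub(r"[^A-Za-z0-9_\-]", "", name)
    if not name:
        raise ValueError("name is empty after sanitizing")
    return name


if __name__ == "__main__":
    print(safe_person_name("John Doe"))  # John_Doe
```

Note that this simple rule also strips accented and non-Latin characters, so adapt the pattern if your names need them.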

Script B: train_model.py
Trains LBPH recognizer from images in dataset/.

# train_model.py
"""
Train LBPH face recognizer with images under dataset/<person>/*.png

Generates:
    - models/face_recognizer.yml
    - models/labels.pkl (mapping name -> label_id)

Usage:
    python train_model.py
"""
import pickle
from pathlib import Path

import cv2
import numpy as np

DATASET_DIR = Path("dataset")
MODELS_DIR = Path("models")
MODELS_DIR.mkdir(parents=True, exist_ok=True)


def load_images_and_labels(dataset_dir: Path):
    label_ids = {}
    current_id = 0
    x_train = []   # face images (grayscale numpy arrays)
    y_labels = []  # numeric labels

    # One sub-folder per person; sorting keeps label ids stable between runs.
    for person_dir in sorted([d for d in dataset_dir.iterdir() if d.is_dir()]):
        label = person_dir.name
        if label not in label_ids:
            label_ids[label] = current_id
            current_id += 1
        id_ = label_ids[label]

        for img_path in sorted(person_dir.glob("*.png")):
            img = cv2.imread(str(img_path), cv2.IMREAD_GRAYSCALE)
            if img is None:
                continue  # unreadable or corrupt file
            # ensure consistent size
            face = cv2.resize(img, (200, 200))
            x_train.append(face)
            y_labels.append(id_)

    return x_train, y_labels, label_ids


def train_and_save():
    if not DATASET_DIR.exists() or not any(DATASET_DIR.iterdir()):
        raise RuntimeError("Dataset directory is empty. Run capture_images.py first to collect faces.")

    print("[INFO] Loading images and labels...")
    x_train, y_labels, label_ids = load_images_and_labels(DATASET_DIR)
    if len(x_train) == 0:
        raise RuntimeError("No training images found.")
    print(f"[INFO] {len(x_train)} training images found for {len(label_ids)} people.")

    # Create LBPH recognizer (requires opencv-contrib)
    try:
        recognizer = cv2.face.LBPHFaceRecognizer_create()
    except AttributeError:
        raise RuntimeError("LBPH recognizer not found. Ensure opencv-contrib-python is installed.")

    # Train (labels are passed as an explicit 32-bit integer array)
    print("[INFO] Training LBPH recognizer...")
    recognizer.train(x_train, np.array(y_labels, dtype=np.int32))

    # Save model and labels
    model_path = MODELS_DIR / "face_recognizer.yml"
    recognizer.write(str(model_path))
    labels_path = MODELS_DIR / "labels.pkl"
    with open(labels_path, "wb") as f:
        pickle.dump(label_ids, f)

    print(f"[INFO] Model saved to {model_path}")
    print(f"[INFO] Label map saved to {labels_path}")


if __name__ == "__main__":
    train_and_save()

Notes
• The saved label mapping maps name -> id. At recognition time we invert it to get id -> name.
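
The inversion mentioned in this note is a one-line dictionary comprehension. The sketch below shows it on a made-up two-person mapping, including a pickle round trip like the one between the training and attendance scripts; invert_labels and resolve are hypothetical helper names used only here.

```python
import pickle


def invert_labels(label_ids: dict) -> dict:
    """Turn a name -> id mapping into id -> name, refusing duplicate ids."""
    id_to_name = {label_id: name for name, label_id in label_ids.items()}
    if len(id_to_name) != len(label_ids):
        raise ValueError("two names share the same label id")
    return id_to_name


def resolve(id_to_name: dict, label_id: int) -> str:
    """Look up a predicted id, falling back to 'Unknown' for unseen ids."""
    return id_to_name.get(label_id, "Unknown")


if __name__ == "__main__":
    # Illustrative mapping; the real one comes from models/labels.pkl.
    saved = pickle.dumps({"Alice": 0, "Bob": 1})
    id_to_name = invert_labels(pickle.loads(saved))
    print(resolve(id_to_name, 1))  # Bob
    print(resolve(id_to_name, 7))  # Unknown
```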

Script C: attendance.py
Real-time recognition and attendance logging.

# attendance.py
"""
Real-time facial recognition attendance app.

Usage:
    python attendance.py --camera 0 --threshold 60

It will:
    - Load models/face_recognizer.yml and models/labels.pkl
    - Use webcam, detect faces, predict labels and confidence.
    - When a known face is recognized with confidence below threshold,
      log to logs/attendance_YYYY-MM-DD.csv
    - Avoid duplicate marking for same person in same day.
"""
import argparse
import csv
import logging
import pickle
import time
from datetime import datetime, date
from pathlib import Path

import cv2

# Setup paths
MODELS_DIR = Path("models")
LOGS_DIR = Path("logs")
LOGS_DIR.mkdir(parents=True, exist_ok=True)

# Configure logging
logging.basicConfig(level=logging.INFO, format="%(asctime)s | %(levelname)s | %(message)s")


# Load model and labels
def load_model_and_labels():
    model_path = MODELS_DIR / "face_recognizer.yml"
    labels_path = MODELS_DIR / "labels.pkl"
    if not model_path.exists() or not labels_path.exists():
        raise RuntimeError("Model or labels not found. Run train_model.py first.")
    try:
        recognizer = cv2.face.LBPHFaceRecognizer_create()
        recognizer.read(str(model_path))
    except Exception as e:
        raise RuntimeError("Failed to load LBPH model. Ensure opencv-contrib-python is installed.") from e

    with open(labels_path, "rb") as f:
        label_map = pickle.load(f)  # name -> id
    # invert map to id -> name
    id_to_name = {v: k for k, v in label_map.items()}
    logging.info(f"Loaded model with {len(id_to_name)} labels.")
    return recognizer, id_to_name


# Attendance logging helper
def mark_attendance(name: str):
    today = date.today().isoformat()
    log_file = LOGS_DIR / f"attendance_{today}.csv"

    # Ensure header exists and avoid duplicates
    seen = set()
    if log_file.exists():
        with open(log_file, newline='', encoding='utf-8') as f:
            reader = csv.DictReader(f)
            for r in reader:
                seen.add(r.get("name"))
    if name in seen:
        logging.info(f"Already marked today: {name}")
        return False

    now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    with open(log_file, "a", newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=["name", "timestamp"])
        if f.tell() == 0:
            writer.writeheader()  # new or empty file
        writer.writerow({"name": name, "timestamp": now})
    logging.info(f"Marked attendance: {name} at {now}")
    return True


def run_attendance(camera_index: int = 0, threshold: float = 60.0):
    recognizer, id_to_name = load_model_and_labels()

    cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    face_cascade = cv2.CascadeClassifier(cascade_path)
    if face_cascade.empty():
        raise RuntimeError("Failed to load Haar cascade.")

    cap = cv2.VideoCapture(camera_index)
    if not cap.isOpened():
        raise RuntimeError("Could not open camera index {}".format(camera_index))

    font = cv2.FONT_HERSHEY_SIMPLEX
    last_mark_time = {}  # name -> timestamp to add debounce (seconds)
    DEBOUNCE_SECONDS = 10

    logging.info("Starting attendance. Press 'q' to quit.")
    while True:
        ret, frame = cap.read()
        if not ret:
            continue

        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = face_cascade.detectMultiScale(
            gray, scaleFactor=1.2, minNeighbors=5, minSize=(60, 60)
        )
        for (x, y, w, h) in faces:
            face = gray[y:y+h, x:x+w]
            face_resized = cv2.resize(face, (200, 200))

            # predict
            label_id, conf = recognizer.predict(face_resized)  # conf: lower is better for LBPH
            name = id_to_name.get(label_id, "Unknown")
            text = f"{name} ({conf:.1f})"
            color = (0, 255, 0) if conf <= threshold else (0, 0, 255)
            cv2.rectangle(frame, (x, y), (x+w, y+h), color, 2)
            cv2.putText(frame, text, (x, y-10), font, 0.8, color, 2)

            # If confidence good, mark attendance with simple debounce
            if conf <= threshold and name != "Unknown":
                now_ts = time.time()
                last = last_mark_time.get(name, 0)
                if now_ts - last > DEBOUNCE_SECONDS:
                    mark_attendance(name)
                    last_mark_time[name] = now_ts

        cv2.imshow("Attendance - Press q to quit", frame)
        key = cv2.waitKey(1) & 0xFF
        if key == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Run facial recognition attendance.")
    parser.add_argument("--camera", type=int, default=0,
                        help="Webcam index (default 0).")
    parser.add_argument("--threshold", type=float, default=60.0,
                        help="LBPH confidence threshold (lower is stricter).")
    args = parser.parse_args()
    run_attendance(args.camera, args.threshold)
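
The decision logic inside the recognition loop (accept only when the LBPH distance is at or below the threshold, and touch the CSV at most once per person every few seconds) can be isolated so it is testable without a camera. The sketch below is one possible way to factor it out; the Debouncer class and should_mark function are hypothetical and do not appear in attendance.py.

```python
import time


class Debouncer:
    """Allow an action for a given key at most once per interval."""

    def __init__(self, interval_seconds: float = 10.0, clock=time.time):
        self.interval = interval_seconds
        self.clock = clock  # injectable so tests can fake the time
        self._last = {}     # key -> time of the last allowed action

    def ready(self, key) -> bool:
        now = self.clock()
        last = self._last.get(key)
        if last is not None and now - last <= self.interval:
            return False
        self._last[key] = now
        return True


def should_mark(name: str, conf: float, threshold: float, debouncer: Debouncer) -> bool:
    """Known person, distance within threshold, and not marked moments ago."""
    return name != "Unknown" and conf <= threshold and debouncer.ready(name)


if __name__ == "__main__":
    gate = Debouncer(interval_seconds=10)
    print(should_mark("Alice", 42.0, 60.0, gate))  # True: first good match
    print(should_mark("Alice", 41.5, 60.0, gate))  # False: still inside the debounce window
    print(should_mark("Bob", 83.0, 60.0, gate))    # False: distance above threshold
```

Passing the clock in as a parameter is a design choice made here purely so the behavior can be checked with fixed timestamps.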

6) Sample Output or Results

Workflow

  1. For each person: python capture_images.py --name Alice --num 50 … --name Bob etc.
  2. Train: python train_model.py → produces models/face_recognizer.yml and models/labels.pkl.
  3. Run attendance: python attendance.py --camera 0 --threshold 60

Console / log samples

2025-10-28 09:00:00 | INFO | Loaded model with 2 labels.
2025-10-28 09:00:05 | INFO | Marked attendance: Alice at 2025-10-28 09:00:05
2025-10-28 09:00:20 | INFO | Already marked today: Alice

Sample CSV (logs/attendance_2025-10-28.csv)

name,timestamp
Alice,2025-10-28 09:00:05
Bob,2025-10-28 09:02:40
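
A daily log in this format is easy to post-process with the standard csv module. The sketch below reads the sample rows shown above from an in-memory string and reports who is present and who is missing; the roster and the helper names names_present and absentees are assumptions made for illustration.

```python
import csv
import io

# Same two rows as the sample CSV above, held in memory for the demo.
SAMPLE_LOG = (
    "name,timestamp\n"
    "Alice,2025-10-28 09:00:05\n"
    "Bob,2025-10-28 09:02:40\n"
)


def names_present(csv_file) -> set:
    """Return the set of names found in an attendance CSV file object."""
    return {row["name"] for row in csv.DictReader(csv_file)}


def absentees(roster, present) -> list:
    """Return roster members with no attendance row, in roster order."""
    return [name for name in roster if name not in present]


if __name__ == "__main__":
    present = names_present(io.StringIO(SAMPLE_LOG))
    print(sorted(present))                                # ['Alice', 'Bob']
    print(absentees(["Alice", "Bob", "Carol"], present))  # ['Carol']
```

For a real file, pass open(path, newline="", encoding="utf-8") instead of the StringIO object.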

GUI: a small OpenCV window shows bounding boxes and a Name (confidence) overlay in real time.

7) Possible Enhancements

  1. Use modern face embeddings (FaceNet / InsightFace / face_recognition library) for higher accuracy and scalability.
  2. Add multi-camera support, stream ingestion, and multiprocessing for scale.
  3. Integrate with DB (Postgres) and web dashboards for analytics.
  4. RAG / Context: store attendance metadata (location, device ID) and support admin corrections.
  5. MFA / liveness checks (blink detection, anti-spoofing) to reduce spoofing risk.
  6. Deploy on Jetson / Raspberry Pi with optimized models for edge recognition.
  7. User management UI to add/remove people, trigger retraining automatically.

Final notes & troubleshooting

• opencv-contrib: If cv2.face is missing, uninstall opencv-python and install opencv-contrib-python:
pip uninstall opencv-python -y
pip install opencv-contrib-python
• Lighting and dataset quality: Recognition depends heavily on varied, good-quality images (different angles, lighting). Capture ~30–100 images per person.
• Threshold tuning: Experiment with --threshold in attendance.py for your environment. LBPH reports a distance, so a lower threshold is stricter.
• Privacy & law: Ensure compliance with local privacy and consent laws when using facial recognition.
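
One rough way to approach threshold tuning is to note the distances printed on screen for correct matches and for wrong matches, then see which threshold separates them best. The sketch below does that on invented numbers; the lists are illustrative only and are not measurements from this project, and error_rates and best_threshold are hypothetical helpers.

```python
def error_rates(genuine, impostor, threshold):
    """Return (false_reject_rate, false_accept_rate) at a given threshold.

    genuine:  distances observed when the right person was predicted
    impostor: distances observed when the prediction was wrong
    """
    false_reject = sum(d > threshold for d in genuine) / len(genuine)
    false_accept = sum(d <= threshold for d in impostor) / len(impostor)
    return false_reject, false_accept


def best_threshold(genuine, impostor, candidates):
    """Pick the candidate with the lowest combined error rate."""
    return min(candidates, key=lambda t: sum(error_rates(genuine, impostor, t)))


if __name__ == "__main__":
    # Made-up example distances (lower means a closer match).
    genuine = [35, 42, 48, 55]
    impostor = [70, 82, 64, 95]
    for t in (40, 50, 60, 70, 80):
        print(t, error_rates(genuine, impostor, t))
    print("suggested:", best_threshold(genuine, impostor, [40, 50, 60, 70, 80]))
```

With real recordings the two groups usually overlap, so the choice becomes a trade-off between missed check-ins and wrong check-ins rather than a clean split.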
✅ Tested on: Python 3.10 and above
✅ Libraries used: tkinter (built-in)
✅ Verified: No syntax errors, full GUI functionality working (New, Open, Save, Cut, Copy, Paste, Exit)

5) Output Example

When you run the program, a window titled "Python Notepad" appears with:

  • A large text area to type notes
  • Menus: File, Edit, and Help at the top
Example usage:
1. Type some text.
2. Click File → Save, name it my_notes.txt.
3. Close and re-open with File → Open.
4. Use Edit → Cut/Copy/Paste as needed.

💡 A pop-up appears when you click "About" in the Help menu.

6) Extension Challenge

Make your notepad more advanced

Goal: Add one or more of these features to make your app more professional and user-friendly:

  • Add Dark Mode / Light Mode toggle
  • Add a Find & Replace tool using simpledialog
  • Add font customization (bold, italic, font size)
  • Add autosave after every few seconds using after() method

7) Summary

You've now built a fully functional desktop notepad using Python's Tkinter library.

Through this project, you've practiced:

  • GUI development
  • Event handling
  • File I/O
  • Code modularization using classes

This project bridges the gap between console applications and real desktop software, preparing you for more advanced GUI frameworks like PyQt or Kivy in the future.

Keep building: your next step could be a Text Editor with Syntax Highlighting or Auto-Save!