🔊 Building a Real-Time Scream Detection System with Python and Machine Learning

Hello, DEV Community! 👋

Have you ever wondered if your computer could listen for screams and automatically send help when someone’s in distress?

Some time ago, I worked on exactly this idea as a personal project, and it taught me a lot about combining audio processing and machine learning in a real-world context.

Today, I’ll share how I built this real-time scream detection system, which:

  1. Listens live via your microphone
  2. Predicts if a scream is happening
  3. Pops up alerts and can send SMS notifications

Ready? Let’s dive in!

💡 Why Scream Detection?
Traditional security systems rely heavily on cameras or manual monitoring. But sound can be a powerful indicator of emergencies, especially a scream.

This project grew out of my curiosity to answer:

What if a machine could recognize a scream faster than a human could dial 911?

My goal was to:
🔸 Detect distress in real time
🔸 Automate alerts
🔸 Build something that could be deployed on any desktop

🛠️ What’s Under the Hood?
This project combines audio signal processing and machine learning. Here’s what I used:

Data: Two sets of audio files—screams (the positive class) and non-screams (the negative class).

Features: MFCCs (Mel Frequency Cepstral Coefficients) extracted using librosa.

Models:

  1. SVM for binary classification.
  2. MLPClassifier for more robust pattern recognition.

UI: Built with Kivy to make it clean and modern.

Alerts: Visual pop-ups and optional SMS messages.
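
To make the pipeline concrete, here's a minimal training sketch with scikit-learn. The data here is random placeholder noise (the real project trains on MFCC vectors from the scream/non-scream audio sets), and the 40-dim feature size, kernel, and hidden-layer width are my assumptions, not necessarily what the original used:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

# Placeholder data: in the real project, X holds MFCC feature vectors
# and y is 1 for scream, 0 for non-scream.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))
y = rng.integers(0, 2, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the scaler on training data only, then reuse it at inference time.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

svm = SVC(kernel="rbf").fit(X_train_s, y_train)
mlp = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500).fit(X_train_s, y_train)

print(svm.score(X_test_s, y_test), mlp.score(X_test_s, y_test))
```

The same fitted `scaler` gets saved alongside the models so live snippets are standardized exactly the way the training data was.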

🎯 How It Worked (Step by Step)
1️⃣ 🎵 Listen

The microphone continuously captures short audio snippets.
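
The capture loop itself can't run without a microphone, but the buffer handling looks roughly like this. The sample rate, chunk size, and snippet length are assumptions for the sketch; the commented PyAudio calls show where the live stream would plug in:

```python
import numpy as np

RATE = 22050          # sample rate assumed for this sketch
CHUNK = 1024          # frames per stream.read() call
SNIPPET_SECONDS = 2   # length of each analysis window

def frames_needed(seconds=SNIPPET_SECONDS, rate=RATE, chunk=CHUNK):
    """How many stream.read(CHUNK) calls make up one snippet."""
    return int(rate / chunk * seconds)

def bytes_to_float(raw: bytes) -> np.ndarray:
    """Convert 16-bit PCM bytes (what PyAudio returns) to float32 in [-1, 1]."""
    return np.frombuffer(raw, dtype=np.int16).astype(np.float32) / 32768.0

# With a live stream the loop would be:
#   stream = pyaudio.PyAudio().open(format=pyaudio.paInt16, channels=1,
#                                   rate=RATE, input=True, frames_per_buffer=CHUNK)
#   raw = b"".join(stream.read(CHUNK) for _ in range(frames_needed()))
# Here a synthetic buffer stands in for the microphone:
fake_raw = (np.sin(np.linspace(0, 100, RATE * SNIPPET_SECONDS)) * 3000).astype(np.int16).tobytes()
samples = bytes_to_float(fake_raw)
print(samples.shape)
```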

2️⃣ 🎛️ Process

  1. Each snippet is converted into MFCC features.
  2. The features are standardized with a trained scaler.

3️⃣ 🤖 Predict

  1. SVM and MLP models make predictions.
  2. If both detect a scream, the system triggers a high-risk alert.
  3. If only one model fires, it triggers a medium-risk warning.
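
The agreement logic above is simple enough to show directly (function and level names are mine, but the high/medium split follows the rules just described):

```python
def risk_level(svm_pred: int, mlp_pred: int) -> str:
    """Combine the two binary predictions (1 = scream) into a risk level."""
    if svm_pred == 1 and mlp_pred == 1:
        return "HIGH"    # both models agree: trigger the full alert
    if svm_pred == 1 or mlp_pred == 1:
        return "MEDIUM"  # only one model fired: warn, but don't escalate
    return "NONE"

print(risk_level(1, 1))  # HIGH
print(risk_level(1, 0))  # MEDIUM
print(risk_level(0, 0))  # NONE
```

Requiring agreement between two different model families is what kept the false-positive rate down.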

4️⃣ ⚠️ Alert

The app displays an on-screen alert.

Optionally, it sends a text message with the user’s location.
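
The post doesn't say which SMS service was used, so here is only the message composition, with Twilio shown in comments as one hypothetical way to send it:

```python
def compose_alert(risk: str, location: str) -> str:
    """Build the SMS body sent when a scream is detected."""
    return f"⚠️ Scream detected ({risk} risk). Last known location: {location}"

# Sending is service-specific; with Twilio (one hypothetical choice):
#   from twilio.rest import Client
#   client = Client(ACCOUNT_SID, AUTH_TOKEN)
#   client.messages.create(body=compose_alert("HIGH", loc),
#                          from_=TWILIO_NUMBER, to=EMERGENCY_CONTACT)
print(compose_alert("HIGH", "12.9716, 77.5946"))
```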

Why I Loved Building This
✅ Real-time inference: No noticeable lag.
✅ Dual-model accuracy: Reduced false positives.
✅ Customizable dataset: You can re-train it with your audio.
✅ Nice UI: Kivy made it look polished.

Technologies Used
Python 3
scikit-learn
librosa
Kivy
PyAudio

GitHub
https://github.com/Varun-310/SCREAM
