# 🌙 SleepSound AI - Intelligent Sleep Sound Detection and Analysis Tool

## 🎯 Overview

A deep-learning-based system that automatically detects sleep sounds. It recognizes and logs **snoring, sleep talking, coughing, and turning over**, helping you discover sleep problems you did not know you had.

## Features

- ✅ Real-time audio capture with noise reduction
- ✅ Multi-class sleep sound recognition (snoring / sleep talking / coughing / ambient)
- ✅ Visual sleep report generation
- ✅ Persistent storage and historical analysis
- ✅ Lightweight model suitable for edge-device deployment

## Installation

```bash
pip install numpy scipy librosa sounddevice pyaudio matplotlib pandas scikit-learn
```

## Quick start

```bash
python main.py --mode record --duration 8h
python main.py --mode analyze --date 2024-01-15
```

## Project structure

```
SleepSound-AI/
├── main.py              # entry point
├── audio_capture.py     # audio capture module
├── sound_detector.py    # detection core
├── data_manager.py      # data management
├── report_generator.py  # report generator
└── config.py            # configuration
```

## ⚠️ Disclaimer

This tool is a general wellness aid and is not a substitute for professional medical diagnosis. If you have serious sleep problems, please consult a doctor.

## Real-world scenarios

**Scenario 1 - self-discovery for an office worker.** Xiao Li, 28, programmer: "I always felt drowsy during the day but assumed I was just overworked. After a week with this tool I learned I snore for two hours a night and often talk in my sleep. I had no idea my sleep quality was this poor."

**Scenario 2 - harmony for newlyweds.** Mr. and Mrs. Zhang: "My wife kept saying my snoring disturbed her, and I didn't believe it. The recorded audio made me face how serious the problem was. I'm now getting treatment and our relationship has improved."

**Scenario 3 - health management for seniors.** Aunt Wang, 65: "My children installed this for me. It showed I was coughing frequently at night; a prompt checkup found an early heart and lung problem. I'm so grateful."

**Scenario 4 - data collection for sleep research.** Xiao Chen, medical graduate student: "My research needs large amounts of real sleep sound data. This tool automates collection and labeling - a 10x efficiency gain."

## Pain points

| Pain point | Traditional approach | Why it falls short |
| --- | --- | --- |
| Unawareness | Ask a roommate or family member | Subjective, incomplete, may be withheld |
| No records | Rely on memory | Recall bias, not quantifiable |
| Hard to analyze | Listen to recordings manually | Slow, inefficient, easy to miss events |
| No comparison | One-off observation | Cannot track improvement over time |
| High barrier | Hospital sleep study | Expensive, hard to book, uncomfortable |

## Our solution

SleepSound AI: 24/7 unobtrusive monitoring → AI-based recognition → data visualization → personalized advice.

## Core logic

### System architecture

```
┌──────────────────────────────────────────────────────────┐
│                SleepSound AI architecture                │
├──────────────────────────────────────────────────────────┤
│ ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   │
│ │ Mic capture  │──▶│ Preprocess   │──▶│ Features     │   │
│ │ (24/7)       │   │ denoise/frame│   │ MFCC / mel   │   │
│ └──────────────┘   └──────────────┘   └──────┬───────┘   │
│                                              ▼           │
│ ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   │
│ │ Result store │◀──│ Classifier   │◀──│ DL model     │   │
│ │ SQLite       │   │ (SVM/CNN)    │   │ inference    │   │
│ └──────┬───────┘   └──────────────┘   └──────────────┘   │
│        ▼                                                 │
│ ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   │
│ │ Report       │◀──│ Statistics   │◀──│ Aggregation  │   │
│ │ HTML/charts  │   │ trends       │   │ anomalies    │   │
│ └──────────────┘   └──────────────┘   └──────────────┘   │
└──────────────────────────────────────────────────────────┘
```
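The pipeline in the architecture above can be sketched end to end with stub stages. This is an illustrative sketch only: the function names, thresholds, and the toy classifier are assumptions for demonstration, not the project's actual API.

```python
import numpy as np

def preprocess(chunk: np.ndarray) -> np.ndarray:
    """Remove DC offset and clamp to [-1, 1] (stands in for denoising/framing)."""
    chunk = chunk - np.mean(chunk)
    return np.clip(chunk, -1.0, 1.0)

def extract_features(chunk: np.ndarray) -> dict:
    """Toy feature set: RMS energy and zero-crossing rate."""
    rms = float(np.sqrt(np.mean(chunk ** 2)))
    zcr = float(np.mean(np.abs(np.diff(np.sign(chunk))) > 0))
    return {"rms": rms, "zcr": zcr}

def classify(features: dict) -> str:
    """Threshold stand-in for the trained model."""
    if features["rms"] < 0.01:
        return "silence"
    return "snoring" if features["zcr"] < 0.1 else "ambient"

def run_pipeline(chunks) -> list:
    """capture -> preprocess -> features -> classify -> store (non-silence only)."""
    events = []
    for chunk in chunks:
        label = classify(extract_features(preprocess(chunk)))
        if label != "silence":
            events.append(label)
    return events

# Simulated input: a loud low-frequency chunk and a near-silent chunk
t = np.linspace(0, 1, 16000, endpoint=False)
loud = 0.5 * np.sin(2 * np.pi * 40 * t)   # 40 Hz tone, snore-like energy
quiet = 0.001 * np.random.randn(16000)
print(run_pipeline([loud, quiet]))  # → ['snoring']
```

The real system replaces each stub with the module described below (capture, feature extraction, classifier, storage), but the data flow is the same.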
### Core algorithm flow

1. Audio capture → continuous monitoring at a 16 kHz sample rate
2. Preprocessing → denoising, VAD (voice activity detection), framing
3. Feature extraction → MFCC (13-dim), zero-crossing rate, energy entropy
4. Classification → inference with a pretrained model
5. Post-processing → debouncing, merging consecutive segments
6. Storage and analysis → SQLite, statistical reports

## Modular implementation

### 1️⃣ config.py - configuration

```python
"""
SleepSound AI - configuration.
System parameters, model settings, and paths.
"""
import os
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AudioConfig:
    """Audio capture settings."""
    sample_rate: int = 16000         # sampling rate, 16 kHz
    channels: int = 1                # mono
    chunk_size: int = 1024           # frames per read
    format_type: str = "int16"       # sample format
    silence_threshold: float = 0.01  # silence threshold
    min_sound_duration: float = 0.3  # minimum valid sound length (s)
    max_recording_hours: int = 12    # maximum continuous recording time


@dataclass
class ModelConfig:
    """Model settings."""
    model_path: str = "models/sleep_sound_classifier.pkl"
    feature_type: str = "mfcc"       # feature type: mfcc/mel/spectrogram
    n_mfcc: int = 13                 # number of MFCC coefficients
    frame_length: float = 0.025      # frame length, 25 ms
    hop_length: float = 0.010        # hop length, 10 ms
    use_augmentation: bool = True    # whether to use data augmentation


@dataclass
class SoundClasses:
    """Sound class definitions."""
    snoring: int = 0
    talking_in_sleep: int = 1
    cough: int = 2
    environment: int = 3
    silence: int = 4

    @property
    def class_names(self) -> List[str]:
        return ["snoring", "sleep talking", "cough", "ambient", "silence"]

    @property
    def class_colors(self) -> Dict[int, str]:
        return {
            0: "#FF6B6B",  # snoring       - red
            1: "#4ECDC4",  # sleep talking - teal
            2: "#45B7D1",  # cough         - blue
            3: "#96CEB4",  # ambient       - green
            4: "#DCDDE1",  # silence       - grey
        }


@dataclass
class DatabaseConfig:
    """Database settings."""
    db_path: str = "data/sleep_records.db"
    table_name: str = "sound_events"
    backup_interval_hours: int = 24


@dataclass
class ReportConfig:
    """Report settings."""
    output_dir: str = "reports"
    chart_style: str = "seaborn-v0_8-whitegrid"
    top_n_sounds: int = 10              # show the top-N events in the report
    include_audio_clips: bool = True    # whether to attach audio clips


# Global config instances
AUDIO_CONFIG = AudioConfig()
MODEL_CONFIG = ModelConfig()
SOUND_CLASSES = SoundClasses()
DB_CONFIG = DatabaseConfig()
REPORT_CONFIG = ReportConfig()

# Make sure the output directories exist
for path in [MODEL_CONFIG.model_path, DB_CONFIG.db_path, REPORT_CONFIG.output_dir]:
    dir_path = os.path.dirname(path) if "." in os.path.basename(path) else path
    if dir_path and not os.path.exists(dir_path):
        os.makedirs(dir_path, exist_ok=True)
```

### 2️⃣ audio_capture.py - audio capture module

```python
"""
Audio capture module: real-time microphone capture with basic preprocessing.
"""
import os
import queue
import logging
from datetime import datetime
from typing import Generator, Optional

import numpy as np
import sounddevice as sd

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


class AudioCapture:
    """
    Real-time audio capture.

    Features:
    - streaming capture from the microphone
    - silence detection and filtering
    - chunking and buffering
    - callback-based operation

    Example:
        capture = AudioCapture(sample_rate=16000, chunk_size=1024)
        for audio_chunk in capture.start_stream():
            process(audio_chunk)
    """

    def __init__(self,
                 sample_rate: int = 16000,
                 channels: int = 1,
                 chunk_size: int = 1024,
                 device_index: Optional[int] = None):
        """
        Args:
            sample_rate: sampling rate (Hz)
            channels: number of channels
            chunk_size: frames per read
            device_index: audio device index (None = default device)
        """
        self.sample_rate = sample_rate
        self.channels = channels
        self.chunk_size = chunk_size
        self.device_index = device_index

        # State
        self.is_recording = False
        self.audio_queue: queue.Queue = queue.Queue(maxsize=100)
        self.stream: Optional[sd.InputStream] = None

        # Silence-detection parameters
        self.silence_threshold = 0.01
        self.silence_frames = 0
        self.min_silence_for_break = 50  # consecutive silent frames before a break

        logger.info(f"AudioCapture initialized: {sample_rate}Hz, {channels}ch")

    def _audio_callback(self, indata: np.ndarray, frames: int, time_info, status):
        """Stream callback, invoked asynchronously by sounddevice."""
        if status:
            logger.warning(f"Stream status: {status}")
        # The stream is opened with dtype=np.float32, so samples are
        # already normalized to [-1, 1]; just flatten and enqueue.
        audio_data = indata.flatten().astype(np.float32)
        try:
            self.audio_queue.put_nowait(audio_data.copy())
        except queue.Full:
            logger.warning("Audio queue full, dropping frame")
```
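The audio settings `silence_threshold` and `min_sound_duration` work together during capture: a chunk counts as sound only when its RMS energy exceeds the threshold, and an event is kept only when enough consecutive chunks stay active. A minimal numpy sketch of that gating logic; the helper names are illustrative, not part of the project:

```python
import numpy as np

SAMPLE_RATE = 16000
CHUNK_SIZE = 1024
SILENCE_THRESHOLD = 0.01   # matches AudioConfig.silence_threshold
MIN_SOUND_DURATION = 0.3   # seconds, matches AudioConfig.min_sound_duration

def is_active(chunk: np.ndarray) -> bool:
    """RMS energy compared against the silence threshold."""
    return float(np.sqrt(np.mean(chunk ** 2))) > SILENCE_THRESHOLD

def filter_short_events(activity: list) -> list:
    """Keep only runs of active chunks lasting at least MIN_SOUND_DURATION.
    Returns (start_chunk, end_chunk) index pairs, end exclusive."""
    min_chunks = int(MIN_SOUND_DURATION * SAMPLE_RATE / CHUNK_SIZE)  # 4 chunks
    events, start = [], None
    for i, active in enumerate(activity + [False]):  # sentinel to flush the last run
        if active and start is None:
            start = i
        elif not active and start is not None:
            if i - start >= min_chunks:
                events.append((start, i))
            start = None
    return events

# 10 chunks: one 5-chunk sound run and one 1-chunk blip
activity = [False, True, True, True, True, True, False, True, False, False]
print(filter_short_events(activity))  # → [(1, 6)]
```

The 1-chunk blip is dropped because it is shorter than `min_sound_duration`, which is the debouncing effect the config aims for.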
The remaining `AudioCapture` methods and the `ContinuousRecorder` helper:

```python
    def start_stream(self) -> Generator[np.ndarray, None, None]:
        """
        Start the stream and yield audio chunks.

        Yields:
            np.ndarray: audio chunk

        Raises:
            RuntimeError: device unavailable or permission denied
        """
        self.is_recording = True
        self.audio_queue = queue.Queue(maxsize=100)
        try:
            self.stream = sd.InputStream(
                samplerate=self.sample_rate,
                channels=self.channels,
                blocksize=self.chunk_size,
                dtype=np.float32,
                callback=self._audio_callback,
                device=self.device_index)
            self.stream.start()
            logger.info("Audio stream started successfully")
            while self.is_recording:
                try:
                    audio_chunk = self.audio_queue.get(timeout=1.0)
                    yield audio_chunk
                except queue.Empty:
                    continue
        except Exception as e:
            logger.error(f"Failed to start audio stream: {e}")
            raise RuntimeError(f"Audio device error: {e}")
        finally:
            self._stop_stream()

    def _stop_stream(self):
        """Stop and close the stream."""
        self.is_recording = False
        if self.stream:
            self.stream.stop()
            self.stream.close()
            self.stream = None
        logger.info("Audio stream stopped")

    def get_devices(self) -> list:
        """List available input devices."""
        devices = sd.query_devices()
        input_devices = []
        for i, dev in enumerate(devices):
            if dev["max_input_channels"] > 0:
                input_devices.append({
                    "index": i,
                    "name": dev["name"],
                    "channels": dev["max_input_channels"],
                    "default_samplerate": dev["default_samplerate"],
                })
        return input_devices

    def record_segment(self, duration_seconds: float) -> np.ndarray:
        """Record a segment of the given length and return it as one array."""
        total_frames = int(duration_seconds * self.sample_rate)
        audio_buffer = []
        self.is_recording = True
        recorded_frames = 0
        try:
            with sd.InputStream(samplerate=self.sample_rate,
                                channels=self.channels,
                                blocksize=self.chunk_size,
                                dtype=np.float32) as stream:
                while recorded_frames < total_frames and self.is_recording:
                    chunk, overflowed = stream.read(self.chunk_size)
                    audio_buffer.append(chunk.flatten())
                    recorded_frames += len(chunk)
        except Exception as e:
            logger.error(f"Recording failed: {e}")
            raise
        # Trim by frame count, not by chunk count
        return np.concatenate(audio_buffer)[:total_frames]

    def detect_voice_activity(self, audio_chunk: np.ndarray) -> bool:
        """Energy-based voice activity detection (VAD)."""
        rms_energy = np.sqrt(np.mean(audio_chunk ** 2))
        return rms_energy > self.silence_threshold

    def __enter__(self):
        """Context-manager entry."""
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        """Context-manager exit."""
        self._stop_stream()
        return False


class ContinuousRecorder:
    """Long-running recorder that rotates output into fixed-length segments."""

    def __init__(self,
                 output_dir: str = "recordings",
                 segment_minutes: int = 30,
                 sample_rate: int = 16000):
        """
        Args:
            output_dir: output directory
            segment_minutes: minutes per segment
            sample_rate: sampling rate
        """
        self.output_dir = output_dir
        self.segment_minutes = segment_minutes
        self.sample_rate = sample_rate
        os.makedirs(output_dir, exist_ok=True)

        self.current_segment = []
        self.segment_start_time: Optional[datetime] = None
        self.file_counter = 0

    def add_audio(self, audio_chunk: np.ndarray):
        """Append a chunk to the current segment, rotating when it is full."""
        if self.segment_start_time is None:
            self.segment_start_time = datetime.now()
        self.current_segment.append(audio_chunk)
        elapsed = (datetime.now() - self.segment_start_time).total_seconds() / 60
        if elapsed >= self.segment_minutes:
            self._save_segment()

    def _save_segment(self):
        """Persist the current segment to disk."""
        if not self.current_segment:
            return
        audio_data = np.concatenate(self.current_segment)
        timestamp = self.segment_start_time.strftime("%Y%m%d_%H%M%S")
        filename = f"segment_{timestamp}_{self.file_counter:04d}.wav"
        filepath = os.path.join(self.output_dir, filename)
        # A real implementation would write audio_data to filepath as a WAV
        # file here (simplified in this example)
        self.current_segment = []
        self.segment_start_time = datetime.now()
        self.file_counter += 1
        logger.info(f"Saved segment: {filepath}")
```

### 3️⃣ sound_detector.py - detection core

```python
"""
Sound detection core: feature extraction, model loading, and classification.
"""
import os
import pickle
import warnings
import logging
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple

import numpy as np
import librosa
import joblib
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, accuracy_score

warnings.filterwarnings("ignore")
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


@dataclass
class DetectionResult:
    """A single detection event."""
    timestamp: str
    sound_type: str
    confidence: float
    duration: float
    features: Dict[str, float]

    def to_dict(self) -> Dict:
        return {
            "timestamp": self.timestamp,
            "sound_type": self.sound_type,
            "confidence": self.confidence,
            "duration": self.duration,
            "features": self.features,
        }
```
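Post-processing step 5 (debouncing and merging consecutive segments) can operate on detection events like the ones above. A sketch using plain tuples; the `max_gap` value and the function name are assumptions for illustration:

```python
def merge_events(events, max_gap=1.0):
    """Merge consecutive events of the same type whose gap is <= max_gap seconds.
    Each event is (start_time, duration, sound_type)."""
    merged = []
    for start, dur, kind in sorted(events):
        if merged:
            last_start, last_dur, last_kind = merged[-1]
            gap = start - (last_start + last_dur)
            if kind == last_kind and gap <= max_gap:
                # Extend the previous event instead of appending a new one
                merged[-1] = (last_start, start + dur - last_start, kind)
                continue
        merged.append((start, dur, kind))
    return merged

events = [(0.0, 2.0, "snoring"), (2.5, 3.0, "snoring"),
          (10.0, 1.0, "cough"), (12.5, 1.0, "snoring")]
print(merge_events(events))
# → [(0.0, 5.5, 'snoring'), (10.0, 1.0, 'cough'), (12.5, 1.0, 'snoring')]
```

Merging before storage keeps one database row per snoring episode rather than one per classified chunk, which makes the nightly statistics far easier to read.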
Continuing `sound_detector.py` with the feature extractor and classifier:

```python
class FeatureExtractor:
    """Extract classification feature vectors from raw audio."""

    def __init__(self,
                 n_mfcc: int = 13,
                 frame_length: float = 0.025,
                 hop_length: float = 0.010,
                 sample_rate: int = 16000):
        """
        Args:
            n_mfcc: number of MFCC coefficients
            frame_length: frame length (s)
            hop_length: hop length (s)
            sample_rate: sampling rate
        """
        self.n_mfcc = n_mfcc
        self.frame_length = int(frame_length * sample_rate)
        self.hop_length = int(hop_length * sample_rate)
        self.sample_rate = sample_rate
        logger.info(f"FeatureExtractor initialized: {n_mfcc} MFCCs")

    def extract_features(self, audio: np.ndarray) -> np.ndarray:
        """Extract a feature vector from one audio segment (float32, normalized)."""
        if len(audio) < self.frame_length:
            # Pad short clips to at least one frame
            audio = np.pad(audio, (0, self.frame_length - len(audio)))

        features = []

        # 1. MFCCs (mean and std over time)
        mfccs = librosa.feature.mfcc(y=audio, sr=self.sample_rate,
                                     n_mfcc=self.n_mfcc,
                                     n_fft=self.frame_length,
                                     hop_length=self.hop_length)
        features.extend(np.mean(mfccs, axis=1).tolist())
        features.extend(np.std(mfccs, axis=1).tolist())

        # 2. Zero crossing rate
        zcr = librosa.feature.zero_crossing_rate(audio,
                                                 frame_length=self.frame_length,
                                                 hop_length=self.hop_length)
        features.append(np.mean(zcr))
        features.append(np.std(zcr))

        # 3. Spectral centroid
        spectral_centroid = librosa.feature.spectral_centroid(
            y=audio, sr=self.sample_rate,
            n_fft=self.frame_length, hop_length=self.hop_length)
        features.append(np.mean(spectral_centroid))
        features.append(np.std(spectral_centroid))

        # 4. Spectral bandwidth
        spectral_bandwidth = librosa.feature.spectral_bandwidth(
            y=audio, sr=self.sample_rate,
            n_fft=self.frame_length, hop_length=self.hop_length)
        features.append(np.mean(spectral_bandwidth))
        features.append(np.std(spectral_bandwidth))

        # 5. Mel spectrogram (mean dB per band)
        mel_spec = librosa.feature.melspectrogram(
            y=audio, sr=self.sample_rate,
            n_fft=self.frame_length, hop_length=self.hop_length)
        features.extend(np.mean(librosa.power_to_db(mel_spec), axis=1).tolist())

        # 6. Energy
        energy = np.sum(librosa.feature.rms(y=audio) ** 2)
        features.append(float(energy))

        return np.array(features, dtype=np.float32)

    def extract_batch_features(self, audio_segments: List[np.ndarray]) -> np.ndarray:
        """Extract features for a list of segments; returns (n_samples, n_features)."""
        return np.vstack([self.extract_features(seg) for seg in audio_segments])


class SoundClassifier:
    """Machine-learning classifier for sleep sounds."""

    def __init__(self, model_path: Optional[str] = None):
        """
        Args:
            model_path: path to a pre-trained model; None creates a new model
        """
        self.model = None
        self.scaler = StandardScaler()
        self.feature_extractor = FeatureExtractor()
        self.class_names = ["snoring", "sleep talking", "cough", "ambient", "silence"]
        self.is_trained = False

        if model_path and os.path.exists(model_path):
            self.load_model(model_path)
        else:
            self._create_default_model()

    def _create_default_model(self):
        """Create the default model (random forest)."""
        self.model = RandomForestClassifier(n_estimators=100,
                                            max_depth=20,
                                            random_state=42,
                                            n_jobs=-1)
        logger.info("Created default RandomForest classifier")

    def train(self,
              X_train: np.ndarray,
              y_train: np.ndarray,
              X_val: Optional[np.ndarray] = None,
              y_val: Optional[np.ndarray] = None) -> Dict[str, float]:
        """
        Train the model.

        Args:
            X_train: training features
            y_train: training labels
            X_val: validation features
            y_val: validation labels

        Returns:
            Dict: training metrics
        """
        # Standardize features
        X_train_scaled = self.scaler.fit_transform(X_train)
        if X_val is not None:
            X_val_scaled = self.scaler.transform(X_val)

        # Fit the model
        self.model.fit(X_train_scaled, y_train)
        self.is_trained = True

        # Compute metrics
        train_pred = self.model.predict(X_train_scaled)
        train_acc = accuracy_score(y_train, train_pred)
        metrics = {"train_accuracy": train_acc}

        if X_val is not None and y_val is not None:
            val_pred = self.model.predict(X_val_scaled)
            val_acc = accuracy_score(y_val, val_pred)
            metrics["val_accuracy"] = val_acc
            logger.info(f"Training complete - Train Acc: {train_acc:.4f}, "
                        f"Val Acc: {val_acc:.4f}")
        else:
            logger.info(f"Training complete - Train Acc: {train_acc:.4f}")

        return metrics

    def predict(self, audio: np.ndarray) -> Tuple[str, float]:
        """
        Predict the sound type of a single audio segment.

        Args:
            audio: audio data

        Returns:
            Tuple[str, float]: (sound type, confidence)
        """
        if not self.is_trained:
            logger.warning("Model not trained, using default prediction")
            return ("unknown", 0.0)

        # Extract features, apply the fitted scaler, and classify
        features = self.feature_extractor.extract_features(audio).reshape(1, -1)
        features_scaled = self.scaler.transform(features)
        probs = self.model.predict_proba(features_scaled)[0]
        idx = int(np.argmax(probs))
        return (self.class_names[idx], float(probs[idx]))
```
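Downstream of the classifier, the report generator aggregates classified events into per-class totals for the nightly summary. A sketch of that statistics step; the function name is illustrative, not the project's actual API:

```python
from collections import defaultdict

def summarize(events):
    """Total duration (seconds) and count per sound type.
    Each event is (sound_type, duration_seconds)."""
    totals = defaultdict(lambda: {"count": 0, "seconds": 0.0})
    for kind, seconds in events:
        totals[kind]["count"] += 1
        totals[kind]["seconds"] += seconds
    return dict(totals)

# One night's merged events
night = [("snoring", 90.0), ("snoring", 45.5), ("cough", 3.0)]
summary = summarize(night)
print(summary["snoring"])  # → {'count': 2, 'seconds': 135.5}
```

In the full system these totals would come from a SQL aggregation over the SQLite event table, then feed the charts in the HTML report.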