
Study log for December 27, 2023: adding noise

news 2026/2/8 15:15:12

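1. Planned study content for today
2. What I studied today
  - 1. Add noise to audio clips
    - Signal-to-noise ratio (SNR)
    - Adding additive white Gaussian noise (AWGN)
    - Adding real-world noises
  - 2. A small demo from Kaggle: a CNN model
    - Problems encountered at runtime
      - Bug when adjusting the sampling rate
  - 3. Determine whether speaker recognition is feasible at 90 dB
  - 4. Traffic forecasting
3. Tasks actually completed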

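1. Planned study content for today

Determine whether speaker recognition is feasible at 90 dB
Compare traffic forecasting models
No playing with my phone while studying 🤡

Starting today's study 😄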

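2. What I studied today

1. Add noise to audio clips

Learn how to add noise to audio data. Later, noise at different SNRs can be added to the original signal samples to evaluate model performance under different noise conditions.
First, load the original audio.wav (it contains the spoken phrase "leave my dog alone"):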

import librosa
signal, sr = librosa.load("path/to/audio.wav")

Plot the signal:

import matplotlib.pyplot as plt
plt.plot(signal)


Signal-to-noise ratio (SNR)

RMS is the root mean square.
Compute the RMS of the signal:

import math
import numpy as np
RMS=math.sqrt(np.mean(signal**2))

$\text{dB} = 20 \times \log_{10}(\text{RMS})$
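
For reference, here is a sketch of the SNR relation that the full noise-adding code in this post relies on. It assumes the amplitude-ratio form in dB; $RMS_{signal}$ is a symbol introduced here for illustration, and $RMS_{noise}$ is the RMS of the noise:

$$\text{SNR} = 20 \log_{10}\left(\frac{RMS_{signal}}{RMS_{noise}}\right) \quad\Longrightarrow\quad RMS_{noise} = \frac{RMS_{signal}}{10^{\text{SNR}/20}} = \sqrt{\frac{RMS_{signal}^{2}}{10^{\text{SNR}/10}}}$$

For example, at SNR = 10 dB the required noise RMS is $RMS_{signal} / 10^{0.5} \approx 0.316 \times RMS_{signal}$.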

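Adding additive white Gaussian noise (AWGN)

1. How to generate AWGN

The noise follows a Gaussian distribution with mean 0 and standard deviation $RMS_{noise}$.

noise=np.random.normal(0, STD_n, signal.shape[0])
# np.random.normal() generates random numbers that follow a normal distribution.
# This creates an array with the same length as the input signal, where every element is drawn from a normal distribution with mean 0 and standard deviation STD_n (not variance; here STD_n equals the noise RMS).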

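[Figure: plot of the generated noise]
2. Frequency analysis of AWGN
Use the fast Fourier transform (FFT) to analyze the frequency content of the noise: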

X=np.fft.rfft(noise)
radius,angle=to_polar(X)

The magnitude spectrum is roughly flat across all frequencies, which matches the "white" property.
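
The flatness can also be checked numerically instead of by eye. The following is only a rough sketch under my own assumptions: the noise is synthetic (standard deviation 0.1, 40000 samples), the spectrum is split into 8 equal bands, and the ratio between the largest and smallest band average is used as an informal flatness measure. None of these choices come from the original code.

```python
import numpy as np

# Synthetic AWGN; the std of 0.1 and the length of 40000 are arbitrary choices.
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 0.1, 40000)

# Magnitude spectrum of the real-valued noise, with the DC bin dropped.
magnitude = np.abs(np.fft.rfft(noise))[1:]

# Average magnitude in 8 equal-width frequency bands.
bands = np.array_split(magnitude, 8)
band_means = np.array([band.mean() for band in bands])

# For white noise every band should carry about the same average magnitude,
# so this ratio should stay close to 1.
flatness = band_means.max() / band_means.min()
print("band means:", band_means)
print("max/min ratio:", flatness)
```

A ratio far from 1 would suggest coloured noise, meaning more energy in some frequency ranges than in others.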
3. Adding the noise

signal_noise = signal+noise

[Figure: signal with noise added, SNR = 10 dB]
Full code for adding the noise:

import math

import librosa
import matplotlib.pyplot as plt
import numpy as np

# SNR in dB.
# Given a signal and a desired SNR, return the AWGN that should be added
# to the signal to get the desired SNR.
def get_white_noise(signal, SNR):
    # RMS value of the signal
    RMS_s = math.sqrt(np.mean(signal**2))
    # RMS value of the noise
    RMS_n = math.sqrt(RMS_s**2 / (pow(10, SNR / 10)))
    # Additive white Gaussian noise, therefore mean = 0.
    # Because the sample length is large (typically > 40000),
    # we can use the population formula for the standard deviation.
    # Because mean = 0, STD = RMS.
    STD_n = RMS_n
    noise = np.random.normal(0, STD_n, signal.shape[0])
    return noise

# Convert a complex np array to polar arrays (2 arrays: abs and angle)
def to_polar(complex_ar):
    return np.abs(complex_ar), np.angle(complex_ar)

# **********************************
# ********* add AWGN noise *********
# **********************************
signal_file = '/home/sleek_eagle/research/emotion/code/audio_processing/signal.wav'
signal, sr = librosa.load(signal_file)
signal = np.interp(signal, (signal.min(), signal.max()), (-1, 1))
noise = get_white_noise(signal, SNR=10)
# analyze the frequency components of the noise
X = np.fft.rfft(noise)
radius, angle = to_polar(X)
plt.plot(radius)
plt.xlabel("FFT coefficient")
plt.ylabel("Magnitude")
plt.show()
signal_noise = signal + noise
plt.plot(signal_noise)
plt.xlabel("Sample number")
plt.ylabel("Amplitude")
plt.show()
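
To check that noise generated this way really lands at the requested SNR, the achieved SNR can be measured after the fact. This is a minimal sketch under my own assumptions: a synthetic 440 Hz sine stands in for the speech clip, and rms() and measured_snr_db() are hypothetical helper names that do not appear in the original code.

```python
import numpy as np

def rms(x):
    """Root mean square of a 1-D array."""
    return np.sqrt(np.mean(x ** 2))

def measured_snr_db(signal, noise):
    """SNR in dB computed from the RMS values of signal and noise."""
    return 20 * np.log10(rms(signal) / rms(noise))

# Stand-in signal: 3 seconds of a 440 Hz sine at 16 kHz (an assumption,
# used here only so that the example needs no audio file).
t = np.arange(48000) / 16000
signal = np.sin(2 * np.pi * 440 * t)

target_snr = 10  # dB
# Same relation as in get_white_noise: RMS_n = RMS_s / 10^(SNR/20)
noise_std = rms(signal) / 10 ** (target_snr / 20)

rng = np.random.default_rng(0)
noise = rng.normal(0, noise_std, signal.shape[0])

achieved = measured_snr_db(signal, noise)
print("target SNR:", target_snr, "dB, achieved SNR:", achieved, "dB")
```

The achieved value differs slightly from the target because the noise is a finite random sample, so its RMS only approximates the requested standard deviation.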

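Adding real-world noises

Here a recording of real noise is added to the original audio.
We need to compute the RMS of the original audio and the RMS of the noise audio. To reach a specified SNR, the RMS of the noise has to be changed. The way to do this is to multiply every noise sample by the same constant, which multiplies the noise RMS by that constant too, so that the noise reaches the required RMS.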
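
The scaling step can be written out as a short derivation. This is a sketch: $c$ is the scaling constant, $n_i$ are the $N$ noise samples, and $RMS_{required}$ and $RMS_{current}$ are names introduced here for the target noise RMS and the RMS of the unscaled noise recording.

$$RMS(c \cdot n) = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (c\, n_i)^2} = |c| \cdot RMS(n) \quad\Longrightarrow\quad c = \frac{RMS_{required}}{RMS_{current}}, \qquad RMS_{required} = \sqrt{\frac{RMS_{signal}^{2}}{10^{\text{SNR}/10}}}$$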
Noise audio (the sound of running water):

Audio with noise added:
To listen to the signal and noise I used and also to the noise-added audio files that were created by adding noise to the signal, go to

# Given a signal, a noise recording (audio) and a desired SNR, return the
# noise (a scaled version of the noise input) that gives the desired SNR.
def get_noise_from_sound(signal, noise, SNR):
    RMS_s = math.sqrt(np.mean(signal**2))
    # required RMS of the noise
    RMS_n = math.sqrt(RMS_s**2 / (pow(10, SNR / 10)))
    # current RMS of the noise
    RMS_n_current = math.sqrt(np.mean(noise**2))
    noise = noise * (RMS_n / RMS_n_current)
    return noise

# **********************************
# ****** add real world noise ******
# **********************************
signal, sr = librosa.load(signal_file)
signal = np.interp(signal, (signal.min(), signal.max()), (-1, 1))
plt.plot(signal)
plt.xlabel("Sample number")
plt.ylabel("Signal amplitude")
plt.show()

noise_file = '/home/sleek_eagle/research/emotion/code/audio_processing/noise.wav'
noise, sr = librosa.load(noise_file)
noise = np.interp(noise, (noise.min(), noise.max()), (-1, 1))

# crop the noise if it is longer than the signal
# for this code len(noise) should be greater than len(signal)
# it will not work otherwise!
if len(noise) > len(signal):
    noise = noise[0:len(signal)]

noise = get_noise_from_sound(signal, noise, SNR=10)
signal_noise = signal + noise
print("SNR = " + str(20 * np.log10(math.sqrt(np.mean(signal**2)) / math.sqrt(np.mean(noise**2)))))
plt.plot(signal_noise)
plt.xlabel("Sample number")
plt.ylabel("Amplitude")
plt.show()
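
The comments in the code above point out that it only works when the noise clip is longer than the signal. One possible workaround, assuming it is acceptable to loop the noise recording, is to repeat the noise until it covers the signal and then crop it. fit_noise_length is a hypothetical helper of my own, not part of the original code.

```python
import numpy as np

def fit_noise_length(noise, target_len):
    """Return a noise array of exactly target_len samples.

    Noise that is too short is repeated (looped); noise that is too long
    is cropped from the start.
    """
    if len(noise) == 0:
        raise ValueError("noise must contain at least one sample")
    if len(noise) < target_len:
        repeats = int(np.ceil(target_len / len(noise)))
        noise = np.tile(noise, repeats)
    return noise[:target_len]

short_noise = np.array([1.0, 2.0, 3.0])
print(fit_noise_length(short_noise, 7))   # looped, then cropped to 7 samples
print(fit_noise_length(short_noise, 2))   # cropped to 2 samples
```

Looping introduces a small discontinuity at each repeat point, which may or may not matter depending on the noise type.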

Reference link:
click here

2. A small demo from Kaggle: a CNN model

link here

Problems encountered at runtime

Bug when adjusting the sampling rate

Code:

import subprocess

command = (
    "for dir in `ls -1 " + noise_path + "`; do "
    "for file in `ls -1 " + noise_path + "/$dir/*.wav`; do "
    "sample_rate=`ffprobe -hide_banner -loglevel panic -show_streams "
    "$file | grep sample_rate | cut -f2 -d=`; "
    "if [ $sample_rate -ne 16000 ]; then "
    "ffmpeg -hide_banner -loglevel panic -y "
    "-i $file -ar 16000 temp.wav; "
    "mv temp.wav $file; "
    "fi; done; done"
)
subprocess.run(command, shell=True)

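Bug (the "I" prefix marks the TensorFlow line below as an informational message, so it is not itself the error):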

2023-12-26 10:44:38.782251: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

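As a complete beginner, I had a great many questions.

What is subprocess.run doing? It calls a shell script from Python.
What is a shell script? It is a script for writing, running, and automating operating-system commands and tasks. The language is interpreted and is commonly used on Unix, Linux, and Unix-like systems.
The subprocess.run() function:

Function overview: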

subprocess.run(args, *, stdin=None, input=None, stdout=None, 
stderr=None, capture_output=False, shell=False, cwd=None, 
timeout=None, check=False, encoding=None, errors=None, text=None, 
env=None, universal_newlines=None)

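Don't be scared: the run() signature is very long, but not every parameter is needed. The only one we must set is the first, args, which is the command to run.
- args: pass a list or tuple such as ['ls', '-l'], where the first element is the program to execute and the rest are its arguments. In this form the program is run directly, without a shell. args can also be a single command-line string; if it is a shell command, you additionally need to pass shell=True.

Returns: an instance of class subprocess.CompletedProcess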
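
To make the list form of args and the CompletedProcess return value concrete, here is a small sketch. The child command is my own choice for illustration: it runs the current Python interpreter so that the example does not depend on ls or any other system tool being installed.

```python
import subprocess
import sys

# args as a list: the first element is the program, the rest are its arguments.
# No shell is involved, so shell=True is not needed.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from a child process')"],
    capture_output=True,  # collect stdout and stderr instead of printing them
    text=True,            # decode the captured output to str
    check=True,           # raise CalledProcessError on a non-zero exit code
)

# result is a subprocess.CompletedProcess instance.
print(type(result).__name__)
print("return code:", result.returncode)
print("stdout:", result.stdout.strip())
```

Passing check=True and capturing stderr makes failures in the child command visible, which helps when a long shell command fails silently.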

I really did not know how to fix this kind of code, so my choice was to change approach and do the resampling some other way.
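
The post does not say which replacement method was used, so the following is only one possible sketch that does the resampling in Python without a shell. It makes strong assumptions: the files are 16-bit PCM WAV, and plain linear interpolation with no anti-aliasing filter is good enough. resample_wav_to is a hypothetical helper name. A dedicated resampler is usually preferable for real data; for instance, librosa.load(path, sr=16000) resamples while loading.

```python
import wave

import numpy as np

def resample_wav_to(path, target_sr=16000):
    """Rewrite a 16-bit PCM WAV file in place at target_sr.

    Returns the sample rate the file had before the call.
    Uses linear interpolation only, so this is a rough sketch.
    """
    with wave.open(path, "rb") as wf:
        sr = wf.getframerate()
        n_channels = wf.getnchannels()
        sample_width = wf.getsampwidth()
        frames = wf.readframes(wf.getnframes())
    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM WAV files")
    if sr == target_sr or len(frames) == 0:
        return sr  # nothing to do

    # Interleaved int16 samples -> array of shape (n_frames, n_channels)
    data = np.frombuffer(frames, dtype="<i2").reshape(-1, n_channels)
    n_out = int(round(data.shape[0] * target_sr / sr))
    old_t = np.arange(data.shape[0]) / sr   # original sample times (s)
    new_t = np.arange(n_out) / target_sr    # resampled sample times (s)
    out = np.stack(
        [np.interp(new_t, old_t, data[:, c]) for c in range(n_channels)],
        axis=1,
    )
    out = np.round(out).astype("<i2")

    with wave.open(path, "wb") as wf:
        wf.setnchannels(n_channels)
        wf.setsampwidth(2)
        wf.setframerate(target_sr)
        wf.writeframes(out.tobytes())
    return sr
```

Calling this function on every .wav file under the noise directory would play the role of the shell loop above, with the standard-library wave module replacing ffprobe for reading the sample rate.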

3. Determine whether speaker recognition is feasible at 90 dB

Paper: link here
-5 dB is already quite low, so this seems unlikely to work.

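4. Traffic forecasting

Code link: LTE Cell Traffic Grow and Congestion Forecasting
No dataset is provided.
Further reading: How to Use the TimeDistributed Layer in Keras
Next steps: read one related paper per day, starting with papers that have a reproduction. At the same time, study traffic forecasting models, and learn the PyTorch and Keras libraries while building models.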

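3. Tasks actually completed

Studied how to add noise to audio for speaker recognition.

Keep it up tomorrow!
Are there any grad-student study buddies or experts out there? (sob)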
