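This project is built on the STM32F746. Bird call classification is a common environmental sound classification task that is well suited to exploring embedded AI, and it has practical value for ecological research, bird conservation, and biodiversity monitoring. Compressing the classification algorithm and model to fit small devices brings these capabilities to more scenarios, such as smart nest-box monitoring systems and drone patrol monitoring systems. Such systems can help assess ecosystem health and monitor climate change, and can also be used to monitor and study bird distribution, migration routes, and habitat use.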

Dataset

https://xeno-canto.org/ is a website dedicated to sharing bird sounds from around the world.

Raw download

  • Size: 51.4 GB (55,284,289,304 bytes)

  • Size on disk: 51.5 GB (55,297,847,296 bytes)

  • Contains: 6,507 files

  • The raw dataset can be downloaded with a short Python script.
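
The original download script is not reproduced on this page. As a rough sketch of what such a script could look like, the code below builds a search URL per species and saves each recording it finds. The endpoint `https://xeno-canto.org/api/2/recordings`, the JSON fields `numPages`, `recordings`, `id`, and `file`, and the `.mp3` extension are all assumptions based on the xeno-canto API v2 and may have changed; the function names are hypothetical. Check the site's current API documentation before relying on it.

```python
import json
import pathlib
import urllib.parse
import urllib.request

# Assumption: xeno-canto API v2 search endpoint (not confirmed by this article).
API_URL = "https://xeno-canto.org/api/2/recordings"


def build_query_url(genus, species, page=1):
    """Build a search URL that queries one species by its scientific name."""
    params = urllib.parse.urlencode({"query": f"{genus} {species}", "page": page})
    return f"{API_URL}?{params}"


def extract_file_urls(response):
    """Return (recording id, audio URL) pairs from one parsed result page."""
    return [
        (rec["id"], rec["file"])
        for rec in response.get("recordings", [])
        if rec.get("file")  # skip entries without a downloadable file
    ]


def download_species(genus, species, out_dir):
    """Walk every result page for one species and save each recording."""
    out = pathlib.Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    page, num_pages = 1, 1
    while page <= num_pages:
        with urllib.request.urlopen(build_query_url(genus, species, page)) as resp:
            data = json.load(resp)
        num_pages = int(data.get("numPages", 1))
        for rec_id, url in extract_file_urls(data):
            # Assumption: recordings are saved as MP3; some may be other formats.
            target = out / f"{genus}_{species}_{rec_id}.mp3"
            urllib.request.urlretrieve(url, str(target))
        page += 1
```

A call such as `download_species("Certhia", "tianquanensis", "raw/Certhia")` would then fetch every recording returned for that species into a folder of its own.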

      

Training set selection

Birds differ greatly from species to species. We selected 8 species found in and around Sichuan Province for training.

{
    "Locustella": {
        "sp": "chengi",
        "ssp": "",
        "en": "Sichuan Bush Warbler"
    },
    "Certhia": {
        "sp": "tianquanensis",
        "ssp": "",
        "en": "Sichuan Treecreeper"
    },
    "Anser": {
        "sp": "albifrons",
        "ssp": "frontalis",
        "en": "Greater White-fronted Goose"
    },
    "Tragopan": {
        "sp": "caboti",
        "ssp": "",
        "en": "Cabot's Tragopan"
    },    
    "Chrysolophus": {
        "sp": "amherstiae",
        "ssp": "",
        "en": "Lady Amherst's Pheasant"
    },
    "Tetraogallus": {
        "sp": "himalayensis",
        "ssp": "koslowi",
        "en": "Himalayan Snowcock"
    },    
    "Bambusicola": {
        "sp": "thoracicus",
        "ssp": "",
        "en": "Chinese Bamboo Partridge"
    },
    "Arborophila": {
        "sp": "brunneopectus",
        "ssp": "",
        "en": "Bar-backed Partridge"
    }
}

Data preprocessing

The recordings are first split into 1000 ms training samples, and features are then extracted with a mel filterbank.
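
The two preprocessing steps can be sketched with NumPy as follows. This is an illustration only, not the project's actual pipeline: the 16 kHz sample rate, 20 ms frame length, 10 ms stride, 512-point FFT, Hamming window, and log compression are all assumptions, chosen because they turn a 1000 ms window into 99 frames of 32 mel energies, which matches the 3,168 input features and 32 columns listed for the network.

```python
import numpy as np

SAMPLE_RATE = 16000   # assumption: audio resampled to 16 kHz
WINDOW_MS = 1000      # from the article: 1000 ms training samples
FRAME_LEN_S = 0.02    # assumption: 20 ms analysis frame
FRAME_STRIDE_S = 0.01 # assumption: 10 ms hop between frames
NUM_FILTERS = 32      # matches the 32 columns of the reshape layer
NFFT = 512            # assumption: FFT size


def split_windows(signal, sample_rate=SAMPLE_RATE, window_ms=WINDOW_MS):
    """Cut a 1-D signal into non-overlapping windows, dropping the remainder."""
    n = sample_rate * window_ms // 1000
    count = len(signal) // n
    return signal[: count * n].reshape(count, n)


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(num_filters, nfft, sample_rate):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to Nyquist."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), num_filters + 2)
    bins = np.floor((nfft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((num_filters, nfft // 2 + 1))
    for i in range(num_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):    # rising edge of the triangle
            fbank[i, k] = (k - left) / (center - left)
        for k in range(center, right):   # falling edge of the triangle
            fbank[i, k] = (right - k) / (right - center)
    return fbank


def mel_energies(window, sample_rate=SAMPLE_RATE):
    """Return a (frames, filters) matrix of log mel filterbank energies."""
    frame_len = int(round(FRAME_LEN_S * sample_rate))
    stride = int(round(FRAME_STRIDE_S * sample_rate))
    num_frames = 1 + (len(window) - frame_len) // stride
    idx = np.arange(frame_len)[None, :] + stride * np.arange(num_frames)[:, None]
    frames = window[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, NFFT)) ** 2 / NFFT
    energies = power @ mel_filterbank(NUM_FILTERS, NFFT, sample_rate).T
    return np.log(energies + 1e-10)  # small offset avoids log(0)
```

Flattening the result of `mel_energies` for one window gives a vector of 99 × 32 = 3,168 values under these assumed settings.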

Neural network training

Network architecture

  • Input layer (3,168 features)
  • Reshape layer (32 columns)
  • 1D conv / pool layer (16 neurons, 3 kernel size, 1 layer)
  • Dropout (rate 0.3)
  • 1D conv / pool layer (32 neurons, 5 kernel size, 1 layer)
  • Dropout (rate 0.3)
  • Flatten layer
  • Dense layer (64 neurons)
  • Dropout (rate 0.3)
  • Output layer (9 classes)
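
To make the layer list concrete, the sketch below traces tensor shapes and counts trainable parameters through the stack. The article does not state the padding mode or pooling size, so "same" padding and a pooling size of 2 are assumptions, and the helper names are hypothetical. The 9 output classes correspond to the 8 species plus the noise class that appears in the confusion matrix.

```python
import math


def conv1d_same(shape, filters, kernel_size):
    """Shape and parameter count of a 1D convolution with 'same' padding."""
    steps, channels = shape
    params = kernel_size * channels * filters + filters  # weights + biases
    return (steps, filters), params


def maxpool1d_same(shape, pool_size=2):
    """Shape after max pooling with stride == pool_size and 'same' padding."""
    steps, channels = shape
    return (math.ceil(steps / pool_size), channels)


def dense(units_in, units_out):
    """Output width and parameter count of a fully connected layer."""
    return units_out, units_in * units_out + units_out


def trace_model(num_features=3168, columns=32, num_classes=9):
    """Walk the listed layers; dropout and flatten add no parameters."""
    shape = (num_features // columns, columns)       # reshape layer
    total = 0
    shape, p = conv1d_same(shape, filters=16, kernel_size=3)
    total += p
    shape = maxpool1d_same(shape)
    shape, p = conv1d_same(shape, filters=32, kernel_size=5)
    total += p
    shape = maxpool1d_same(shape)
    flat = shape[0] * shape[1]                       # flatten layer
    hidden, p = dense(flat, 64)
    total += p
    out, p = dense(hidden, num_classes)
    total += p
    return {"reshaped": (num_features // columns, columns),
            "flatten": flat, "output": out, "params": total}
```

Under these assumptions the input reshapes to 99 × 32, the flatten layer sees 800 values, and the network has 55,993 trainable parameters, most of them in the 64-neuron dense layer.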

Training results

Accuracy: 93.3%
Loss: 0.51

Confusion matrix

Class         Anser  Arborophila  Bambusicola  Certhia  Chrysolophus  Locustella  Tetraogallus  Tragopan  noise
Anser         100%   0%           0%           0%       0%            0%          0%            0%        0%
Arborophila   2.2%   92.4%        0%           0%       1.1%          1.1%        0%            1.1%      2.2%
Bambusicola   4.8%   0%           90.5%        0%       0%            0%          0%            0%        4.8%
Certhia       0%     0%           0%           88%      4%            8%          0%            0%        0%
Chrysolophus  6.3%   2.1%         2.1%         8.3%     79.2%         0%          0%            0%        2.1%
Locustella    0%     0%           0%           2.6%     5.3%          89.5%       0%            0%        2.6%
Tetraogallus  0%     0%           0%           0%       0%            0%          95.5%         0%        4.5%
Tragopan      0%     0%           0%           3.4%     0%            0%          0%            96.6%     0%
noise         0%     0%           0%           0.4%     1.2%          1.6%        0%            0%        96.7%
F1 score      0.86   0.96         0.93         0.81     0.82          0.86        0.98          0.97      0.97
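
Each row of the matrix sums to roughly 100%, which suggests that rows are true classes and columns are predictions (an assumption; the article does not say). That would explain why Anser has 100% on the diagonal but an F1 score of only 0.86: samples from other classes that are predicted as Anser lower its precision. The sketch below computes per-class F1 from raw counts. The counts in the usage note are hypothetical and are not this project's data, since the percentages alone do not give the number of samples per class.

```python
def per_class_f1(counts):
    """Per-class F1 from a count matrix.

    counts[i][j] is the number of samples whose true class is i and whose
    predicted class is j (rows are true classes, columns are predictions).
    """
    n = len(counts)
    scores = []
    for c in range(n):
        tp = counts[c][c]                                  # correctly predicted as c
        fn = sum(counts[c]) - tp                           # class c predicted as something else
        fp = sum(counts[r][c] for r in range(n)) - tp      # other classes predicted as c
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores
```

For example, with the hypothetical counts `[[10, 0], [3, 27]]`, the first class is always recognized (recall 100%), yet three samples of the second class are mistaken for it, so its F1 is 20/23, about 0.87.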

On-device performance

Inferencing time: 25 ms.
Peak RAM usage: 9.5K
Flash usage: 85.2K
