2026-02-12 21:00:02 -08:00
parent 77f8236347
commit 8bdbf227ca
1141 changed files with 1010880 additions and 2 deletions


@@ -0,0 +1,47 @@
## Micro Speech
The starting point for speech recognition on an Arduino-based board is TensorFlow Lite for Microcontrollers and its example sketch, micro_speech.
I have adapted the micro_speech example from TensorFlow Lite to follow the philosophy of this framework. The example uses a TensorFlow model which can recognise the words 'yes' and 'no'. The output stream class is TfLiteAudioStream. In the example I am using an ESP32 AudioKit board, but you can replace it with any processor that has a microphone.
Further information can be found in the [Wiki](https://github.com/pschatzmann/arduino-audio-tools/wiki/TensorFlow-Lite---MicroSpeech).
To capture the audio we use an INMP441 microphone:
![INMP441](https://pschatzmann.github.io/Resources/img/inmp441.jpeg)
The INMP441 is a high-performance, low power, digital-output, omnidirectional MEMS microphone with a bottom port. The complete INMP441 solution consists of a MEMS sensor, signal conditioning, an analog-to-digital converter, anti-aliasing filters, power management, and an industry-standard 24-bit I²S interface. The I²S interface allows the INMP441 to connect directly to digital processors, such as DSPs and microcontrollers, without the need for an audio codec in the system.
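
The microphone delivers 24-bit samples, while speech models such as micro_speech typically work on 16-bit PCM. The following is a minimal sketch of the conversion, in case you ever read the raw I²S slots yourself. It assumes each slot arrives as a signed 32-bit word with the 24-bit sample left-justified in the upper bits; `slotToInt16` and `slotsToPcm16` are hypothetical helper names and are not part of arduino-audio-tools.

```cpp
#include <cstdint>
#include <vector>

// Assumption: the 24-bit sample occupies the upper 24 bits of a signed
// 32-bit slot. Keeping the top 16 bits drops the 8 least significant
// sample bits. The right shift of a negative value is an arithmetic shift
// on common compilers (and guaranteed to be one since C++20).
inline int16_t slotToInt16(int32_t slot) {
  return static_cast<int16_t>(slot >> 16);
}

// Converts a buffer of 32-bit slots to 16-bit PCM samples.
inline std::vector<int16_t> slotsToPcm16(const std::vector<int32_t>& slots) {
  std::vector<int16_t> pcm;
  pcm.reserve(slots.size());
  for (int32_t slot : slots) {
    pcm.push_back(slotToInt16(slot));
  }
  return pcm;
}
```

Truncating instead of rounding keeps the helper simple; for speech recognition the lost low-order bits are usually far below the noise floor.
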
## Pins
| INMP441 | ESP32         |
| ------- | ------------- |
| VDD     | 3.3V          |
| GND     | GND           |
| SD      | IN (GPIO32)   |
| L/R     | GND           |
| WS      | WS (GPIO15)   |
| SCK     | BCK (GPIO14)  |

- SCK: Serial data clock for the I²S interface
- WS: Serial data word select for the I²S interface
- L/R: Left / right channel selection
  When set to low, the microphone emits signals on the left channel of the I²S frame.
  When set to high, the microphone emits signals on the right channel.
- SD: Serial data output of the I²S interface
- VDD: Input power, 1.8V to 3.3V
- GND: Power ground

High PSR: -75 dBFS.
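
To illustrate the L/R pin: with L/R tied to GND the microphone only fills the left slot of each I²S frame. If you capture in stereo, the mono signal can be recovered by taking every second sample. The sketch below assumes 16-bit interleaved data with the left sample first (some drivers deliver the opposite order, so check your platform). `extractChannel` is a hypothetical helper, not a function of arduino-audio-tools.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Picks one channel out of interleaved stereo data.
// Assumption: samples are ordered L, R, L, R, ...
std::vector<int16_t> extractChannel(const std::vector<int16_t>& interleaved,
                                    bool right) {
  // A stereo buffer must hold complete frames of two samples each.
  assert(interleaved.size() % 2 == 0);
  std::vector<int16_t> mono;
  mono.reserve(interleaved.size() / 2);
  for (std::size_t i = right ? 1 : 0; i < interleaved.size(); i += 2) {
    mono.push_back(interleaved[i]);
  }
  return mono;
}
```

With the wiring from the table above you would call `extractChannel(data, false)`. When the I²S input is configured for a single channel, as in the example sketch, this step is not needed.
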
### Note
The Info log level helps you to identify any problems. For the best sound quality, use AudioToolsLogLevel::Warning, which is what the sketch in this commit sets.
## Dependencies
You need to install the following libraries:
- https://github.com/pschatzmann/arduino-audio-tools
- https://github.com/pschatzmann/tflite-micro-arduino-examples


@@ -0,0 +1,63 @@
/**
 * @file streams-i2s-tf.ino
 * @author Phil Schatzmann
 * @brief We read audio data from an I2S microphone and send it to TensorFlow Lite to recognize the words yes and no
 * @version 0.1
 * @date 2022-04-07
 *
 * @copyright Copyright (c) 2022
 *
 */
#include "AudioTools.h"
#include "AudioTools/AudioLibs/TfLiteAudioStream.h"
#include "model.h" // tensorflow model
I2SStream i2s; // Audio source
TfLiteAudioStream tfl; // Audio sink
const char* kCategoryLabels[4] = {
    "silence",
    "unknown",
    "yes",
    "no",
};
StreamCopy copier(tfl, i2s); // copy mic to tfl
int channels = 1;
int samples_per_second = 16000;
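// Callback registered in setup() via tcfg.respondToCommand;
// prints each newly detected command
void respondToCommand(const char* found_command, uint8_t score,
                      bool is_new_command) {
  if (is_new_command) {
    char buffer[80];
    // snprintf limits the output to the size of the buffer
    snprintf(buffer, sizeof(buffer), "Result: %s, score: %d, is_new: %s",
             found_command, score, is_new_command ? "true" : "false");
    Serial.println(buffer);
  }
}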
void setup() {
  Serial.begin(115200);
  AudioToolsLogger.begin(Serial, AudioToolsLogLevel::Warning);

  // Setup I2S audio input
  auto cfg = i2s.defaultConfig(RX_MODE);
  cfg.channels = channels;
  cfg.sample_rate = samples_per_second;
  cfg.use_apll = false;
  //cfg.auto_clear = true;
  cfg.buffer_size = 512;
  cfg.buffer_count = 16;
  i2s.begin(cfg);

  // Setup TensorFlow output
  auto tcfg = tfl.defaultConfig();
  tcfg.setCategories(kCategoryLabels);
  tcfg.channels = channels;
  tcfg.sample_rate = samples_per_second;
  tcfg.kTensorArenaSize = 10 * 1024;
  tcfg.respondToCommand = respondToCommand;
  tcfg.model = g_model;
  tfl.begin(tcfg);
}
void loop() { copier.copy(); }
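
The callback in the sketch receives a label, a score, and an `is_new_command` flag. As a rough illustration of how such a flag could be derived from per-category scores, the plain C++ sketch below picks the best-scoring category and reports it as new only if it is confident enough, differs from the last reported label, and a suppression time has passed. This is not the library's implementation: `CommandFilter`, `Detection`, the threshold, and the suppression time are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Result of one filter update (hypothetical type).
struct Detection {
  const char* label;
  uint8_t score;
  bool is_new;
};

// Hypothetical filter: turns raw category scores into "new command" events.
class CommandFilter {
 public:
  CommandFilter(const char* const* labels, std::size_t count,
                uint8_t threshold, uint32_t suppression_ms)
      : labels_(labels), count_(count), threshold_(threshold),
        suppression_ms_(suppression_ms) {
    assert(count_ > 0);  // at least one category is required
  }

  // scores must point to one score per label.
  Detection update(const uint8_t* scores, uint32_t now_ms) {
    // Find the category with the highest score.
    std::size_t best = 0;
    for (std::size_t i = 1; i < count_; ++i) {
      if (scores[i] > scores[best]) best = i;
    }
    const bool confident = scores[best] >= threshold_;
    const bool changed = !has_last_ || best != last_index_;
    const bool waited =
        !has_last_ || (now_ms - last_time_ms_) >= suppression_ms_;
    const bool is_new = confident && changed && waited;
    if (is_new) {
      // Remember what was reported and when.
      last_index_ = best;
      last_time_ms_ = now_ms;
      has_last_ = true;
    }
    return Detection{labels_[best], scores[best], is_new};
  }

 private:
  const char* const* labels_;
  std::size_t count_;
  uint8_t threshold_;
  uint32_t suppression_ms_;
  std::size_t last_index_ = 0;
  uint32_t last_time_ms_ = 0;
  bool has_last_ = false;
};
```

A filter like this could be fed once per inference with the model's output scores and the current time in milliseconds; only results with `is_new` set would then be printed.
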