
ASR_command_words

Automatic Speech Recognition of command words with CNN and RNN-LSTM

Project overview

The code is available in the project's GitHub repository.

Objective:

Recognize and classify 10 spoken German command words using a Long Short-Term Memory (LSTM) network and a Convolutional Neural Network (CNN). The command words are: (1) links (left), (2) rechts (right), (3) vor (forward), (4) zurück (back), (5) start, (6) stop, (7) schneller (faster), (8) langsamer (slower), (9) drehung links (turn left), (10) drehung rechts (turn right). At a higher level, such a system could be used to control a robot by voice.
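To make the 10-class setup concrete, the sketch below maps each command word to a class index and a one-hot target vector. The names `COMMANDS`, `LABEL_TO_INDEX`, `INDEX_TO_LABEL`, and `one_hot`, as well as the 0-based ordering, are assumptions made for illustration; the project itself may encode its labels differently.

```python
# The ten command words from the objective, in the order listed there.
COMMANDS = [
    "links", "rechts", "vor", "zurück", "start",
    "stop", "schneller", "langsamer", "drehung links", "drehung rechts",
]

# Assumed encoding: class index = 0-based position in the list above.
LABEL_TO_INDEX = {word: i for i, word in enumerate(COMMANDS)}
INDEX_TO_LABEL = {i: word for word, i in LABEL_TO_INDEX.items()}


def one_hot(word):
    """Return a one-hot list with one entry per command word."""
    vec = [0] * len(COMMANDS)
    vec[LABEL_TO_INDEX[word]] = 1
    return vec


if __name__ == "__main__":
    print(LABEL_TO_INDEX["vor"], one_hot("vor"))
```

A classifier trained on such targets would output one score per command, and the predicted word is the one with the highest score.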

Part 1: Data acquisition

Part 2: Preprocessing

Part 3: Training of the LSTM and CNN models for classification

Figures: CNN architecture; LSTM architecture

Figures: CNN training curves; LSTM training curves

Part 4: Results and Evaluation

Average precision, recall, and F1 score, with the standard deviation of the F1 score, for the CNN model (validation data, K=3)

Average precision, recall, and F1 score, with the standard deviation of the F1 score, for the LSTM model (validation data, K=3)

Confusion matrix of the CNN model on the validation datasets (average)

Confusion matrix of the LSTM model on the validation datasets (average)
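The reported metrics can be derived from per-fold confusion matrices. The sketch below shows one way to do this. It assumes macro-averaging over classes and the sample standard deviation across folds; the project does not state which conventions it uses. The function names `macro_scores` and `summarize_folds` are hypothetical, and the small 2-class matrices are toy values for illustration, not the project's results.

```python
from statistics import mean, stdev


def macro_scores(cm):
    """Macro-averaged precision, recall and F1 from a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return mean(precisions), mean(recalls), mean(f1s)


def summarize_folds(fold_cms):
    """Average the macro scores over K folds and report the F1 spread."""
    scores = [macro_scores(cm) for cm in fold_cms]
    f1_values = [s[2] for s in scores]
    return {
        "precision": mean(s[0] for s in scores),
        "recall": mean(s[1] for s in scores),
        "f1": mean(f1_values),
        # Sample standard deviation (an assumption); needs at least 2 folds.
        "f1_std": stdev(f1_values) if len(f1_values) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Toy 2-class confusion matrices for K=3 folds (illustrative only).
    folds = [
        [[2, 0], [0, 2]],
        [[1, 1], [0, 2]],
        [[2, 0], [1, 1]],
    ]
    print(summarize_folds(folds))
```

For the 10 command words, each fold's matrix would be 10×10, and the same functions apply unchanged.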