*****1. Introduction*****
LibriSpeech is an open-source ASR corpus. The official page is:
https://www.openslr.org/12/
The corpus can be downloaded from the link above.

DeepSpeech2 (Deep Speech 2) is an ASR model designed by Baidu (an implementation is also available in Baidu's PaddlePaddle framework).
Google reproduced it in the TensorFlow framework. The TensorFlow implementation
can be found at: https://github.com/tensorflow/models/tree/archive/research/deep_speech
It uses LibriSpeech as its training corpus.

ARM China uses Google's DeepSpeech2 implementation as a reference model.
To verify DeepSpeech2 on the ARM NPU, the LibriSpeech corpus and its preprocessing are needed.


*****2. Pre-condition*****
a. download the LibriSpeech corpus. A copy is already on the cix NFS: /swtest/NPU-Tar/LibriSpeech.
b. untar the needed subsets. Usually the test and dev subsets are needed for inference.
c. git clone the Google TensorFlow models repo. The deep_speech code is at: https://github.com/tensorflow/models/tree/archive/research/deep_speech
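
Step b can be scripted. The following is a minimal sketch, not part of the original tooling. It assumes the archives are named as on the openslr page (for example test-clean.tar.gz) and unpack into a top-level LibriSpeech/ folder; the actual file names under /swtest/NPU-Tar/LibriSpeech are not described here, so treat them as an assumption. The function name untar_subsets is hypothetical.

```python
import tarfile
from pathlib import Path


def untar_subsets(archive_dir, dest_dir, subsets=("test-clean", "dev-clean")):
    """Extract <subset>.tar.gz archives found in archive_dir into dest_dir.

    Returns the subsets that were extracted; missing archives are skipped.
    Archive naming (<subset>.tar.gz) is an assumption, see the text above.
    """
    archive_dir, dest_dir = Path(archive_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    for subset in subsets:
        archive = archive_dir / f"{subset}.tar.gz"
        if not archive.is_file():
            print(f"skip {subset}: {archive} not found")
            continue
        # Only extract archives that come from a trusted source.
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(dest_dir)
        extracted.append(subset)
    return extracted
```

For example, untar_subsets("/swtest/NPU-Tar/LibriSpeech", "/data/ws/AI/dataset/SLR12") would, under the assumptions above, produce /data/ws/AI/dataset/SLR12/LibriSpeech/test-clean and dev-clean.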

*****3. How to Run*****
There are two stages:
  - generate the corpus information csv file (step b below);
  - generate the corpus npy files for each record (step c below). The npy files include the audio_feature data and the transcript token data.
/data/ws/AI/dataset/SLR12/LibriSpeech/test-clean is used as the example. For other subsets, the commands are similar.
a. set up the python environment by copying the two scripts into the deep_speech tree (replace the tensorflow_models path with your real path)
    cp deepspeech_datagen.py /home/anthony/workspace/git/3th/tensorflow_models/research/deep_speech
    cp deepspeech_prep.py /home/anthony/workspace/git/3th/tensorflow_models/research/deep_speech/data
b. convert flac to wav and generate csv file:
    export PYTHONPATH="$PYTHONPATH:/home/anthony/workspace/git/3th/tensorflow_models/research/deep_speech"
    python /home/anthony/workspace/git/3th/tensorflow_models/research/deep_speech/data/deepspeech_prep.py /data/ws/AI/dataset/SLR12/LibriSpeech test-clean

    Make sure the untarred test-clean folder and its files are under /data/ws/AI/dataset/SLR12/LibriSpeech.
    This command creates test-clean-wav to store the wav files converted from flac, and generates test-clean.csv under /data/ws/AI/dataset/SLR12/LibriSpeech.
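
    The source of deepspeech_prep.py is not shown here, so the following is only a sketch of what the csv stage might look like, not the actual script. It assumes the standard LibriSpeech layout (each chapter folder holds a *.trans.txt file with "<utterance-id> <TRANSCRIPT>" lines), assumes the wav files already exist as <utterance-id>.wav in the wav folder, and assumes a tab-separated csv with the columns wav_filename, wav_filesize, transcript and lowercased transcripts, as the upstream deep_speech data scripts appear to use. The flac to wav conversion is left out because it needs an external audio tool. The name build_csv is hypothetical.

```python
import csv
from pathlib import Path


def build_csv(subset_dir, wav_dir, csv_path):
    """Write one row per utterance: wav path, wav size in bytes, transcript.

    Column names, the tab delimiter and lowercasing are assumptions.
    Returns the number of utterance rows written.
    """
    subset_dir, wav_dir = Path(subset_dir), Path(wav_dir)
    rows = []
    for trans_file in sorted(subset_dir.rglob("*.trans.txt")):
        for line in trans_file.read_text(encoding="utf-8").splitlines():
            if not line.strip():
                continue
            # Each line is "<utterance-id> <TRANSCRIPT>".
            utt_id, _, transcript = line.strip().partition(" ")
            wav_path = wav_dir / f"{utt_id}.wav"
            if not wav_path.is_file():
                continue  # wav not converted yet, skip this utterance
            rows.append((str(wav_path), wav_path.stat().st_size, transcript.lower()))
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["wav_filename", "wav_filesize", "transcript"])
        writer.writerows(rows)
    return len(rows)
```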
c. generate npy files
    python
    import sys
    sys.path.append('/home/anthony/workspace/git/3th/tensorflow_models/research/deep_speech')
    import deepspeech_datagen
    deepspeech_datagen.deepspeech_npy('/data/ws/AI/dataset/SLR12/LibriSpeech/test-clean.csv', '/home/anthony/workspace/git/3th/tensorflow_models/research/deep_speech/data/vocabulary.txt', '/data/ws/AI/dataset/SLR12/LibriSpeech/test_clean_npy')

    The command generates the test_clean_npy folder. Under it, the audio folder stores the audio npy files and the transcript folder stores the transcript npy files.
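
    To repeat step c for several subsets, the three arguments can be derived from the subset name. This is a small sketch that follows the naming used in this document (test-clean.csv and test_clean_npy); the helper name npy_job_args is hypothetical, and the argument order (csv, vocabulary, output folder) is taken from the call shown in step c.

```python
from pathlib import Path

SUBSETS = ("dev-clean", "dev-other", "test-clean", "test-other")


def npy_job_args(corpus_root, subset, vocab_file):
    """Return (csv_path, vocab_path, output_dir) for one subset, e.g. 'test-clean'."""
    root = Path(corpus_root)
    csv_path = root / f"{subset}.csv"
    # Output folders use underscores: test-clean -> test_clean_npy.
    out_dir = root / f"{subset.replace('-', '_')}_npy"
    return str(csv_path), str(vocab_file), str(out_dir)
```

    With deepspeech_datagen imported as in step c, a loop over SUBSETS could then call deepspeech_datagen.deepspeech_npy(*npy_job_args(root, subset, vocab)) for each subset.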
    

*****4. Result saved*****
The data is prepared in /swtest:
(base) anthony@anthony:/swtest/NPU/LibriSpeech$ tree -L 1
.
├── dev-clean.csv
├── dev_clean_npy
├── dev-clean-wav
├── dev-other.csv
├── dev_other_npy
├── dev-other-wav
├── test-clean.csv
├── test_clean_npy
├── test-clean-wav
├── test-other.csv
├── test_other_npy
└── test-other-wav

8 directories, 4 files
(base) anthony@anthony:/swtest/NPU/LibriSpeech$ tree -L 1 test_clean_npy
test_clean_npy
├── audio
└── transcript
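
The layout above can be checked with a short script before running inference. This is a sketch only: it assumes every subset follows the same pattern as test-clean (a csv file, a -wav folder, and an _npy folder with audio and transcript subfolders), which the listing shows explicitly only for test_clean_npy. The name missing_outputs is hypothetical.

```python
from pathlib import Path

SUBSETS = ("dev-clean", "dev-other", "test-clean", "test-other")


def missing_outputs(root, subsets=SUBSETS):
    """Return the expected files and folders that are absent under root."""
    root = Path(root)
    missing = []
    for subset in subsets:
        npy_dir = root / f"{subset.replace('-', '_')}_npy"
        expected = [
            (root / f"{subset}.csv", Path.is_file),
            (root / f"{subset}-wav", Path.is_dir),
            (npy_dir / "audio", Path.is_dir),
            (npy_dir / "transcript", Path.is_dir),
        ]
        missing += [str(path) for path, check in expected if not check(path)]
    return missing
```

An empty list from missing_outputs("/swtest/NPU/LibriSpeech") means all expected outputs are in place.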

During inference, only the npy files are used.
To decode the inference result, the vocabulary file (vocabulary.txt) is also needed.
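
As an illustration of how the vocabulary maps token ids back to text, here is a minimal sketch. It assumes vocabulary.txt holds one token per line and that lines starting with '#' are comments, so a token's id is its position among the non-comment lines. Turning raw model output into token ids (for example, collapsing repeated CTC labels) is not covered. The function names are hypothetical.

```python
def load_vocab(vocab_file):
    """Build token<->index maps from the non-comment lines of the vocabulary file."""
    token_to_index, index_to_token = {}, {}
    index = 0
    with open(vocab_file, encoding="utf-8") as f:
        for line in f:
            token = line.rstrip("\n")  # keep a line holding a single space as the space token
            if token.startswith("#"):
                continue  # comment line (assumed convention)
            token_to_index[token] = index
            index_to_token[index] = token
            index += 1
    return token_to_index, index_to_token


def encode(text, token_to_index):
    """Convert a transcript string to a list of token ids."""
    return [token_to_index[ch] for ch in text]


def decode(token_ids, index_to_token):
    """Convert a sequence of token ids back to a string."""
    return "".join(index_to_token[int(i)] for i in token_ids)
```

The same maps serve both directions: encode gives the kind of ids stored in the transcript npy files (under the assumptions above), and decode turns predicted ids into readable text.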
