GSoC Week 2

27 May 2018

Reading time ~4 minutes

Google Summer of Code Week 2

Hello, all. 😀

This is my third blog post in this category (Google Summer of Code 2018).

So, the second week is over now. Here are some important points from it.

1. 70GB model for Mandarin

So the right way to download the model turned out to be:

  • Step 1: add the following entry to the /etc/hosts file:
180.76.189.142  cloud.dlnel.org
  • Step 2: run the following command in /mnt/rds/redhen/gallina/models:
wget -O zhidao_giga.klm http://cloud.dlnel.org/filepub/?uuid=245d02bb-cd01-4ebe-b079-b97be864ec37

We fixed all the bugs and successfully downloaded the 70GB model for Mandarin on CWRU HPC on May 22nd.
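
Putting the two steps together, here is a minimal sketch of the whole download. It assumes you (or an admin) can edit /etc/hosts on the machine doing the download and that you have write access to the models directory; the -c flag simply lets wget resume the 70GB transfer if it gets interrupted.

# Step 1: pin cloud.dlnel.org to a reachable IP (needs root on the download machine)
echo "180.76.189.142  cloud.dlnel.org" | sudo tee -a /etc/hosts

# Step 2: download the Mandarin language model into the shared models directory;
# -c resumes an interrupted transfer, which matters for a 70GB file
cd /mnt/rds/redhen/gallina/models
wget -c -O zhidao_giga.klm "http://cloud.dlnel.org/filepub/?uuid=245d02bb-cd01-4ebe-b079-b97be864ec37"

# Sanity check: the finished file should be about 70GB
ls -lh zhidao_giga.klm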

2. Using Singularity to run Python files

I’ve figured out how to use Singularity to run Python files with the help of Prof. Steen and Prof. Michael. Here are the steps:

CynthiaHsus-MacBook-Pro:~ suwi$ ssh sxx186@redhen1.case.edu
Last login: Sun May 27 11:09:36 2018 from 58.196.138.203

sxx186@redhen1:~$ ssh sxx186@rider.case.edu
Last login: Sun May 27 11:49:31 2018 from redhen1.cosi.cwru.edu

[sxx186@hpc3 ~]$ srun -p gpu -C gpuk40 --gres=gpu:1 --pty bash
srun: job 7813754 queued and waiting for resources
srun: job 7813754 has been allocated resources

[sxx186@gpu028t ~]$ module load singularity/2.5.1
[sxx186@gpu028t ~]$ cd /mnt/rds/redhen/gallina/Singularity/DeepSpeech2/
[sxx186@gpu028t DeepSpeech2]$ singularity exec -B `pwd`:/mnt RedHenLab-singularity_containers-master-deepspeech2-01.simg python /mnt/housing.py
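
For unattended runs, the same steps can go into a SLURM batch script instead of an interactive srun session. The following is only a sketch that reuses the partition, GPU constraint, module version, and paths shown above; the job name and time limit are made-up placeholders.

#!/bin/bash
# Placeholder job name and time limit; the partition, GPU constraint, and
# Singularity module version match the interactive srun session above.
#SBATCH -J ds2-test
#SBATCH -t 01:00:00
#SBATCH -p gpu
#SBATCH -C gpuk40
#SBATCH --gres=gpu:1

module load singularity/2.5.1

cd /mnt/rds/redhen/gallina/Singularity/DeepSpeech2/
singularity exec -B "$(pwd)":/mnt \
    RedHenLab-singularity_containers-master-deepspeech2-01.simg \
    python /mnt/housing.py

Submitting this with sbatch queues the job just like the srun call, but without keeping a terminal open.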

3. DeepSpeech2 on PaddlePaddle

My project is based on the PaddlePaddle framework, and getting started with PaddlePaddle on the HPC gave me some problems.

3.1 First try: the PaddlePaddle Docker appliance on the CWRU HPC

Initially, the team planned to try the PaddlePaddle Docker appliance on the CWRU HPC. Prof. Peter gave it a try but failed because the makers of the Docker image put stuff in /root, which is not accessible in Singularity unless you have root rights on the host machine. The details are shown below.

singularity pull docker://paddlepaddle/deep_speech:latest-gpu
git clone https://github.com/PaddlePaddle/DeepSpeech.git
# I then made a copy of DeepSpeech/examples/run_infer_golden.sh called
# run_infer_golden_pgu.sh and edited the latter, commenting out the automatic
# download of the language models (because this was too slow) and setting
# "--use_gpu=False" because I was on a machine without a GPU. If your Internet
# connection to China is good, you can of course have it download the models
# automatically.

# I then downloaded and copied the files manually.
# The downloads were actually done on xingfu (our Chinese server) and then copied over to Erlangen with scp, because it was much faster this way.

cd DeepSpeech/models/baidu_en8k

wget http://cloud.dlnel.org/filepub/?uuid=37a1c211-ec47-494c-973c-31437a10ae90

# wget has trouble naming this file, so rename it to "baidu_en8k_model.tar.gz"

tar xvf baidu_en8k_model.tar.gz

cd ..

cd lm

wget http://paddlepaddle.bj.bcebos.com/model_zoo/speech/common_crawl_00.prune01111.trie.klm

Then we can run it:

singularity shell -B ./DeepSpeech:/DeepSpeech deep_speech-latest-gpu.simg

And from inside the container, we get the following error message:

Singularity deep_speech-latest-gpu.simg:/DeepSpeech> sh examples/baidu_en8k/run_infer_golden_pgu.sh

Traceback (most recent call last):
  File "infer.py", line 8, in <module>
    import paddle.v2 as paddle
  File "/usr/local/lib/python2.7/dist-packages/paddle/v2/__init__.py", line 25, in <module>
    from . import dataset
  File "/usr/local/lib/python2.7/dist-packages/paddle/v2/dataset/__init__.py", line 18, in <module>
    import mnist
  File "/usr/local/lib/python2.7/dist-packages/paddle/v2/dataset/mnist.py", line 20, in <module>
    import paddle.v2.dataset.common
  File "/usr/local/lib/python2.7/dist-packages/paddle/v2/dataset/common.py", line 50, in <module>
    must_mkdirs(DATA_HOME)
  File "/usr/local/lib/python2.7/dist-packages/paddle/v2/dataset/common.py", line 43, in must_mkdirs
    os.makedirs(DATA_HOME)
  File "/usr/lib/python2.7/os.py", line 150, in makedirs
    makedirs(head, mode)
  File "/usr/lib/python2.7/os.py", line 150, in makedirs
    makedirs(head, mode)
  File "/usr/lib/python2.7/os.py", line 157, in makedirs
    mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/root/.cache'
Failed in inference!
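
One workaround we have not tried yet would be to give the container a writable /root, for example by binding an empty scratch directory over it so that paddle.v2 can create its /root/.cache; whether the HPC allows this kind of user bind is still an open question.

# Untested idea: bind a writable scratch directory over /root inside the
# container, so paddle.v2 can create /root/.cache without root rights
mkdir -p /tmp/paddle_root
singularity shell -B ./DeepSpeech:/DeepSpeech -B /tmp/paddle_root:/root \
    deep_speech-latest-gpu.simg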

3.2 Second try: Install PaddlePaddle from source

Since the Docker approach failed, we switched to installing PaddlePaddle from source on the HPC with Singularity support.

What we’d like to see is PaddlePaddle and DeepSpeech2 inside a Singularity container, so that we can deploy it on different accounts and different HPCs. The method we’d like to use is Singularity Hub. Let’s see if we can make the PaddlePaddle/DeepSpeech2 installation work in a reproducible manner, using Singularity recipes kept on Red Hen Lab’s GitHub account and built on Singularity Hub. The goal is to have a system where we can upgrade the image simply by modifying the recipe, make use of the automated build system, and deploy at will in multiple locations.

For instance, your recipe can specify “From: ubuntu:latest”. If we maintain these recipes on Red Hen Lab’s GitHub account, we can easily keep them updated, maintain different versions for different purposes, and pull them in as needed for our HPC pipelines.
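
As a sketch of what such a recipe could look like (the From line follows the example above; the labels, package list, and runscript are placeholders rather than our actual build steps):

# Hypothetical Singularity recipe sketch, to be kept on Red Hen Lab's GitHub
# account and built automatically on Singularity Hub
Bootstrap: docker
From: ubuntu:latest

%labels
    Maintainer RedHenLab

%post
    # placeholder build steps; the real recipe would install PaddlePaddle
    # and DeepSpeech2 from source here
    apt-get update && apt-get install -y python python-pip git

%runscript
    exec python "$@"

Once a recipe like this is hooked up to Singularity Hub, the built image can be pulled on any HPC with a command of the form "singularity pull shub://<github-user>/<repo>" (the actual collection name is still to be decided).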

We are still working on it and hope we can succeed asap!

References:

  • Singularity Hub: a likely useful service for creating, maintaining, and distributing singularity images

  • The system for creating images

  • Using Singularity to create portable applications

4. Conclusion

This week was great, but I could only contribute a little because of my English exam. The next week (the third week) will be more about building Singularity images, and it ends on June 3. Till then, happy coding and cheers. See you later.


