The trick with pocketsphinx is to limit the vocabulary you want to recognize. Create a corpus of the kinds of phrases you want it to recognize and feed it through the lmtool here: http://www.speech.cs.cmu.edu/tools/lmtool-new.html
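To make that concrete, here is a rough sketch of generating such a corpus. lmtool takes a plain-text file with one sentence per line; the command phrases and the `corpus.txt` filename below are made-up examples, not anything pocketsphinx requires.

```python
from itertools import product

# Hypothetical command vocabulary; replace with the phrases your app needs.
ACTIONS = ["turn on", "turn off"]
DEVICES = ["the light", "the fan"]


def build_corpus(actions, devices):
    """Return one sentence per action/device combination."""
    return [f"{action} {device}" for action, device in product(actions, devices)]


def write_corpus(path, sentences):
    """Write one sentence per line, the plain-text layout lmtool accepts."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(sentences) + "\n")


if __name__ == "__main__":
    # Upload the resulting file to lmtool to get a language model and dictionary.
    write_corpus("corpus.txt", build_corpus(ACTIONS, DEVICES))
```

The smaller and more regular the set of sentences, the less the recognizer has to choose between, which is the whole point of limiting the vocabulary.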
If you try to use pocketsphinx to recognize arbitrary English (e.g. dictation), it's not going to work very well in my experience.
There are two realistic options: pocketsphinx and Kaldi. On a Pi, pocketsphinx will be your only realistic option for realtime detection. You'll want to move to a Raspberry Pi 3 as well, and use a customized dictionary to get your recognition speed up. Lastly, there are several parameters you can tweak that affect recognition speed.
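A minimal sketch of both ideas, under some assumptions: the dictionary is in the CMUdict-style layout (`word PH1 PH2 ...`, alternates written as `word(2)`), and the tuning flags are the `-beam`, `-wbeam`, `-maxhmmpf` and `-ds` options from the pocketsphinx command-line tools. The helper names and the default values are purely illustrative, so check the help output of your pocketsphinx version and tune against your own audio.

```python
import re


def corpus_vocabulary(sentences):
    """Collect the lowercase set of words used in the corpus."""
    return {word for s in sentences for word in s.lower().split()}


def trim_dictionary(dict_lines, vocab):
    """Keep only pronunciation entries whose headword is in vocab.

    Assumes CMUdict-style lines: 'word PH1 PH2 ...', with alternate
    pronunciations marked as 'word(2) ...'.
    """
    kept = []
    for line in dict_lines:
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        # Strip an alternate-pronunciation suffix such as "(2)".
        headword = re.sub(r"\(\d+\)$", "", parts[0]).lower()
        if headword in vocab:
            kept.append(line.rstrip("\n"))
    return kept


def tuning_args(beam="1e-40", wbeam="1e-20", maxhmmpf=3000, ds=2):
    """Build speed-related command-line arguments (values are illustrative)."""
    return [
        "-beam", beam,               # search beam; less negative exponent prunes more
        "-wbeam", wbeam,             # beam applied at word exits
        "-maxhmmpf", str(maxhmmpf),  # cap on HMMs evaluated per frame
        "-ds", str(ds),              # compute acoustic scores every ds-th frame
    ]
```

For example, `trim_dictionary(open("full.dic"), corpus_vocabulary(sentences))` would give you only the entries your corpus needs (`full.dic` being whatever full dictionary you start from). Tighter beams and a larger `-ds` generally trade accuracy for speed, so change one value at a time.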
Raw processing power will be the bottleneck on a Raspberry Pi.