What version of grep are you using? Not too long ago the grep in Debian unstable was awful with UTF-8 strings. pg135.txt is Project Gutenberg's Les Misérables. What were your times?
I no longer notice this behavior:
dfc@motherjones:~$ grep --version
grep (GNU grep) 2.9
dfc@motherjones:~$ time LANG=C grep asdf < pg135.txt > /dev/null
real 0m0.017s
user 0m0.008s
sys 0m0.004s
dfc@motherjones:~$ time LANG=UTF8 grep asdf < pg135.txt > /dev/null
real 0m0.017s
user 0m0.012s
sys 0m0.004s
dfc@motherjones:~$ time LANG=en_us.UTF8 grep asdf < pg135.txt > /dev/null
real 0m0.012s
user 0m0.004s
sys 0m0.004s
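One thing worth checking with runs like these is whether each LANG value actually names an installed locale. On glibc systems an unrecognized name generally leaves the program in the C locale, so identical timings could simply mean every run used C. A minimal sketch, assuming the standard `locale` utility is available; `charmap_for` is a hypothetical helper name, and the list of names is just the ones tried in this thread:

```shell
#!/bin/sh
# Print the character map that a given locale name resolves to.
# Warnings about unknown locales are discarded; an unknown name
# typically reports the same charmap as the C locale.
charmap_for() {
    LC_ALL="$1" locale charmap 2>/dev/null
}

for name in C UTF8 en_us.UTF8 en_US.UTF-8; do
    printf '%-12s -> %s\n' "$name" "$(charmap_for "$name")"
done
```

If a name reports the same charmap as C rather than UTF-8, grep was most likely not running in a UTF-8 locale for that timing. `locale -a` lists the names that are installed. Note that the sketch sets LC_ALL, which overrides LANG, so it shows what the name would resolve to even if other LC_* variables are set in your environment.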
There is not a lot of info about this in Debian bug 604408.
; grep --version
GNU grep 2.5.3
; time LANG=C grep asdf < lesms10.txt > /dev/null
real 0m0.025s
user 0m0.011s
sys 0m0.014s
; time /usr/local/plan9/bin/grep asdf < lesms10.txt > /dev/null
real 0m0.082s
user 0m0.043s
sys 0m0.013s
; time LANG=en_US.UTF-8 grep adsf < lesms10.txt > /dev/null
real 0m1.209s
user 0m0.818s
sys 0m0.018s
Those are the only two grep implementations I have handy. GNU grep 2.6.3 takes the same amount of time searching for 'asdf' in both locales, but searching for '.' is still slow. Thanks for pointing that out.
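For anyone who wants to repeat the comparison, here is a rough sketch that times both patterns in both locales in one go. It assumes bash (for the `time` keyword and `TIMEFORMAT`) and that en_US.UTF-8 is installed; `bench_grep` is a hypothetical function name, and the locale and pattern lists are only the ones mentioned in this thread:

```shell
#!/usr/bin/env bash
# Sketch: time grep for every locale/pattern pair on one file.
# Prints one line per pair: locale, pattern, elapsed seconds.
bench_grep() {
    local file=$1 loc pat secs
    for loc in C en_US.UTF-8; do
        for pat in asdf .; do
            # TIMEFORMAT=%R makes bash's `time` print only elapsed seconds.
            # `env` sets the locale for grep alone, not for the shell.
            secs=$( { TIMEFORMAT=%R; time env LC_ALL="$loc" grep -- "$pat" "$file" >/dev/null 2>&1; } 2>&1 )
            printf '%-12s %-6s %ss\n' "$loc" "'$pat'" "$secs"
        done
    done
}

# Run against the file named on the command line, if any.
if [ -n "${1:-}" ]; then bench_grep "$1"; fi
```

Saved as, say, bench.sh, it would be run as `bash bench.sh pg135.txt`. Single runs on a file this small are noisy, so repeat a few times before reading much into the numbers.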
Debian bug 604408 is at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=604408
If memory serves me correctly, upstream fixed this sometime after 2.7.1 or 2.7.3.
Funner fact: GNU grep used to be slow with UTF-8.