Filed under: E-mail, Productivity, Google
Why won't Gmail add filtering based on character sets?

We know that billions of people speak languages that use these character encodings. The problem is that most of us don't speak any language that uses a non-Western character set. Why can't Gmail offer us the chance to filter out messages we can't read?
Cyrillic messages, like the one pictured above, often contain charset="Windows-1251" in their headers, but Gmail doesn't let you create filters based on header information. Gmail's spam filtering, which for the most part works flawlessly, often fails to catch non-English messages, and the end result for a highly active Gmail user is that lots of obvious non-English spam slips through the cracks. We love you, Gmail. We just wish that you'd address this tiny character flaw so we can love you all the more.
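For readers who pull their mail into their own scripts, here is a rough sketch of the kind of header check we have in mind. It is not something Gmail offers; it assumes you already have the raw message text, and the names BLOCKED_CHARSETS, declared_charsets and is_unreadable are made up for illustration. The list of charsets is also just an assumption about what you personally can't read.

```python
from email import message_from_string

# Hypothetical list: charsets the reader cannot read. Adjust to taste.
BLOCKED_CHARSETS = {"windows-1251", "koi8-r"}


def declared_charsets(raw_message):
    """Return the set of charsets declared in the Content-Type headers."""
    msg = message_from_string(raw_message)
    found = set()
    # walk() visits the message itself and every MIME part inside it.
    for part in msg.walk():
        charset = part.get_content_charset()  # lowercased, or None
        if charset:
            found.add(charset)
    return found


def is_unreadable(raw_message):
    """True if any part declares a charset on the blocked list."""
    return bool(declared_charsets(raw_message) & BLOCKED_CHARSETS)


sample = (
    "Subject: hello\n"
    'Content-Type: text/plain; charset="Windows-1251"\n'
    "\n"
    "body text\n"
)
print(is_unreadable(sample))
```

A check like this only looks at what the message declares about itself, so a Cyrillic message sent as UTF-8 would pass straight through.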

Reader Comments (Page 1 of 1)
kerunt said 7:56PM on 6-19-2007
So imagine this: you set up your account to filter emails containing Cyrillic characters. Then someone emails you, and for the "cool" effect writes their name with a Cyrillic character. We've all seen it: a variation of 1337 5p34k. That email would end up in spam and you would likely miss it. Sounds fun?
Nevan said 1:35AM on 6-20-2007
You could always try copying a few of those strange Russian letters and making a spam filter out of them. For example:
if a subject contains п or д or ч
filter it to spam. (I have no idea what they are; I copied them from a Russian site.)
BTW, I've always found the way Gmail handles Japanese text to be great, compared to other webmail.
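Editor's sketch of the letter-matching idea above, written so that one stray Cyrillic character in a name does not condemn a message: it measures what share of the letters are Cyrillic instead of matching single characters. The function names and the 0.5 threshold are assumptions chosen for illustration, and the text passed in is assumed to be already decoded.

```python
import unicodedata


def cyrillic_ratio(text):
    """Fraction of the alphabetic characters in text that are Cyrillic."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    # Unicode names of Cyrillic letters start with "CYRILLIC ...".
    cyrillic = sum(
        1 for ch in letters if "CYRILLIC" in unicodedata.name(ch, "")
    )
    return cyrillic / len(letters)


def looks_cyrillic(text, threshold=0.5):
    """True if at least `threshold` of the letters are Cyrillic."""
    return cyrillic_ratio(text) >= threshold


print(looks_cyrillic("п д ч"))              # all letters are Cyrillic
print(looks_cyrillic("Hello from Дmitri"))  # one Cyrillic letter among many
```

Because this looks at the characters themselves and not at the declared charset, it treats a Unicode message the same as a Windows-1251 one.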
Jhon said 5:15AM on 6-20-2007
see http://photo2text.com/
jurkis said 4:17AM on 6-20-2007
"п or д or ч" - these are just the letters p, d and ch :)
Colin said 5:51AM on 6-20-2007
Mainly because, undoubtedly, someone you know has somehow set their default charset to Cyrillic and you'll cry foul that suddenly you're NOT getting their messages.
Just keep flagging as spam in Gmail's interface.
Shmuel said 6:00AM on 6-20-2007
Personally, I have no problem filtering out language groups I have no possibility of understanding, as well as people who use 1337 5p34k. I find communicating in either "language" to be a completely frustrating experience not worth my time.
Atanas Boev said 6:12AM on 6-20-2007
Can't I just send Unicode email, and still have the Cyrillic characters pass through?
By the way, do you filter comments based on the character set?
Междувпрочем, филтрирате ли коментари, според майката си тракало?
deceze said 6:29AM on 6-20-2007
The trick? Click "Report Spam", not "Delete". Gmail does filter non-English mail. Train your filter; it works fine for me.
Nick said 6:52AM on 6-20-2007
I'm with deceze. I had a spell of non-English stuff hammering my inbox, but after a few "Report Spam" clicks it all seems to have gone away :D
Linh said 8:57AM on 6-20-2007
I think the bigger issue here is that Gmail won't let me filter on header information in general. That's a powerful tool that I have with my FastMail account, and I wish Google would implement it. Granted, I pay for my FastMail account...
Jamar said 1:13PM on 6-20-2007
Do tell, though: look at that e-mail. Since when does 268=30-52? Some teacher in Russia (or wherever else uses Cyrillic characters) isn't doing their job right.
kerunt said 1:18AM on 6-21-2007
Jamar, that's not math, that's a telephone number. There are two, "268-31-52" and "268-30-52", but for some reason, the first one has some characters messed up (that's why you see the equal sign).
Jay said 9:00AM on 6-25-2007
I can't remember the last time I saw a Cyrillic message in my Spam folder, much less my inbox. I use Yahoo Mail.