What does "挨⎒" have to do with "<html"?

1 comment, 200 words, read ~121 times.
Garbled text in an email.

I received this weird bit of mojibake in an email. Here's the raw text view:

------=_NextPart_001_009E_01D4D8BF.D0737E10
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

=E6=8C=A8=E2=8E=92tml xmlns:v=3D"urn:schemas-microsoft-com:vml" xmlns:o=3D"=
urn:schemas-microsoft-com:office:office" xmlns:w=3D"urn:schemas-microsoft-c=
om:office:word" xmlns:m=3D"http://schemas.microsoft.com/office/2004/12/omml=
" xmlns=3D"http://www.w3.org/TR/REC-html40">

What's going on?

挨 is a Chinese, Japanese, Korean (CJK) unified ideograph (U+6328)
⎒ is the passive-pull-up-output symbol (U+2392)

That's somehow replaced:

< - less-than […]
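The two code points quoted above can be checked by decoding the first few bytes of the quoted-printable body. This is a minimal sketch, assuming Python 3 and only the standard library; the variable names are my own and the byte string is copied from the raw text view.

```python
import quopri
import unicodedata

# The start of the quoted-printable body, copied from the raw email above.
raw = b"=E6=8C=A8=E2=8E=92tml"

# Undo the quoted-printable escaping, then decode the bytes as UTF-8,
# which is the charset the MIME part declares.
decoded = quopri.decodestring(raw).decode("utf-8")

# Report the code point and Unicode name of the two unexpected characters.
for ch in decoded[:2]:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

The rest of the decoded string is the plain ASCII "tml", which is what is left of the opening tag once the first two characters have been replaced.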


Facebook Mangles Unicode URLs

350 words, read ~733 times.

Facebook rewrites URLs that have Unicode in the path. This is not best practice and could be dangerous. It is possible to create a URL like http://bit.ly/😀 because Unicode characters are valid in the path. The URL-encoded representation is: bit.ly/%F0%9F%98%80 Facebook mangles these URLs in such a way that it might be […]
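The encoded form quoted above is just the UTF-8 bytes of the emoji, percent-escaped. Here is a small sketch, assuming Python 3's standard `urllib.parse` module, that reproduces the mapping in both directions; the variable names are illustrative.

```python
from urllib.parse import quote, unquote

# U+1F600 GRINNING FACE, written as an escape so the source stays ASCII.
emoji = "\U0001F600"

# Percent-encode the character's UTF-8 bytes, as a client would
# before putting the path on the wire.
encoded = quote(emoji)
print("bit.ly/" + encoded)

# Decoding the escaped path gives back the original character.
assert unquote(encoded) == emoji
```

Both spellings identify the same path, so a rewriter that preserves either one intact should still reach the same destination.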


RTL Bugs

350 words, read ~1,311 times.

Take a look at the following text, looks normal enough doesn't it? "Harry ‮".draziw a si ‭Potter Now, try to select the text and see what happens. WHAT WITCHCRAFT IS THIS?! If you examine the source code for this page, you'll see that I'm using the Unicode Bi-Directional characters. "Harry &#x202e;".draziw a si &#8237;Potter These […]
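To see which invisible characters are hiding in that snippet, the two character references can be expanded and looked up. This is a rough sketch, assuming Python 3; `find_bidi_overrides` is a hypothetical helper of mine, not something from the original post.

```python
import html
import unicodedata

# The markup quoted above, with the two numeric character references intact.
source = '"Harry &#x202e;".draziw a si &#8237;Potter'

# Expand the references into the actual (invisible) characters.
text = html.unescape(source)

def find_bidi_overrides(s):
    """Return (index, code point, bidi class) for each override character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.bidirectional(ch))
        for i, ch in enumerate(s)
        if unicodedata.bidirectional(ch) in ("RLO", "LRO")
    ]

for hit in find_bidi_overrides(text):
    print(hit)
```

The first hit is the right-to-left override (U+202E) and the second is the left-to-right override (U+202D). The characters between them are stored in one order and displayed in the other, which is why selecting the text behaves so strangely.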


Another Google Privacy Flaw - Calendar Unexpectedly Leaks Private Information (Disclosed)

8 comments, 700 words, read ~32,826 times.

My wife likes to set reminders for herself in Google Calendar. Recently, she added a note to her personal Google Calendar reading "Email alice@example.com to discuss pay rise" and set the date for a few months from now. She'd had a discussion with her boss, Alice, and they'd agreed to talk about salary later in […]


Interesting Twitter Hashbang Bug

7 comments, 300 words, read ~5,234 times.

Did you know that you can link to a specific Tweet on Twitter? The URL looks like this: https://twitter.com/#!/edent/status/197967209459499008 Pretty obviously, that's the user's name and the ID of their tweet. Simple, right? Not really. Click on that link and you'll see this: That's my name in the URL bar - but the Number […]
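For reference, here is how the pieces of that hashbang URL come apart. This is only a sketch, assuming Python 3's `urllib.parse`; the variable names are mine, and keeping the ID as a string is my own choice rather than anything Twitter specifies.

```python
from urllib.parse import urlsplit

url = "https://twitter.com/#!/edent/status/197967209459499008"

# Everything after "#" is the fragment. Browsers do not send it to the
# server, so the page's own script has to interpret it.
fragment = urlsplit(url).fragment

# Drop the "!" marker and the leading slash, then split into segments.
user, kind, tweet_id = fragment.lstrip("!").strip("/").split("/")

# The ID is kept as a string here, exactly as it appears in the URL.
print(user, kind, tweet_id)
```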


Bugs in Twitter Text Libraries

5 comments, 400 words, read ~186 times.

The Twitter Engineering Team have a set of text processing classes which are meant to simplify and standardise the recognition of URLs, screen names, and hashtags. Dabr makes use of them to stay consistent with Twitter's style. One of the advantages of the text processing is that it will recognise that www.example.com is a […]

