Homeless in Vancouver: RLO spoofing is another Internet security reverse


      Nine years ago, at least, computer industry insiders knew how the invisible bi-directional control characters in the Unicode character set could be exploited to disguise phishing websites and malware.

      It’s nearly a decade later and surprisingly little has been done to mitigate the threat to ordinary computer users.

      These same Unicode control characters, which are now included with all computer typefaces, also form part of the basic toolkit of work-a-day Internet hackers-slash-thieves.

      Unicode bi-directional control characters are being regularly exploited to spoof filenames, Windows registry entries and perhaps even website URLs—all in order to disguise the malicious digital content used to take control of computers and steal from computer users.

      The problem isn’t actually the Unicode control characters themselves but rather the characters running the Unicode Consortium, along with the various top hardware and software companies. These are the characters who have collaborated over the last 20 years to create the poorly secured, easily compromised and weaponized standards (like NTP servers, TLS/SSL and Unicode) that have helped make the Internet the dangerous environment that it has become.

      And Microsoft, the maker of the operating system most targeted by every kind of malware attack, should be singled out for deliberately removing a simple security feature from the Home version of Windows that could easily be used to guard against bi-directional character spoofing.

      One character set to rule them all

      You may know the Unicode text character encoding standard from such historical works as the Unicode of Hammurabi, or you may be more familiar with recent works, such as the DaVinci Unicode—or you may not.

      Unicode is one of those foundational things that underlies computers and the Internet—most of us should never need to know about it. It’s a 20-year-old standardized set of over 120,000 characters allowing most of the world’s writing systems—current and historical—to be rendered digitally.

      Each typeface that a person uses on their computer contains only a subset of this huge Unicode “alphabet”.

      For example, the version of the typeface Arial, which has shipped with Microsoft Windows since 1999, is properly called Arial Unicode MS and includes 27,720 Unicode characters.

      Unicode is fundamentally a good and necessary thing, and so are the invisible bi-directional control characters.

      The so-called Unicode “bidi” control characters are not really characters at all but rather invisible signal switches, or style operators, to control the apparent flow of words. They exist to facilitate the display—separately and together—of scripts that run left-to-right, such as English, and right-to-left, such as Arabic and Hebrew.

      Character  Meaning                   Unicode  Usage
      RLM        Right-to-Left Mark        U+200F   Acts as an Arabic character.
      LRM        Left-to-Right Mark        U+200E   Acts as a Latin character.
      RLE        Right-to-Left Embedding   U+202B   Treats following text as embedded right-to-left.
      LRE        Left-to-Right Embedding   U+202A   Treats following text as embedded left-to-right.
      RLO        Right-to-Left Override    U+202E   Forces following characters right-to-left.
      LRO        Left-to-Right Override    U+202D   Forces following characters left-to-right.
      PDF        Pop Directional Format    U+202C   Restores direction previous to LRE, RLE, RLO, LRO.
      ZWJ        Zero Width Joiner         U+200D   Joins leading and trailing characters.
      ZWNJ       Zero Width Non-Joiner     U+200C   Breaks leading and trailing characters.
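For the curious, the entries in the table can be checked against the Unicode database that ships with Python. This is only a small sketch: the `unicodedata` module is standard, but the names `BIDI_CONTROLS` and `describe` are labels made up for this illustration.

```python
import unicodedata

# Code points copied from the table above.
# The dictionary name is just a label for this sketch.
BIDI_CONTROLS = {
    "RLM": "\u200f",
    "LRM": "\u200e",
    "RLE": "\u202b",
    "LRE": "\u202a",
    "RLO": "\u202e",
    "LRO": "\u202d",
    "PDF": "\u202c",
    "ZWJ": "\u200d",
    "ZWNJ": "\u200c",
}


def describe(char):
    """Return (code point, official name, general category, bidi class)."""
    return (
        f"U+{ord(char):04X}",
        unicodedata.name(char),
        unicodedata.category(char),
        unicodedata.bidirectional(char),
    )


# Print one line per control character in the table.
for abbreviation, char in BIDI_CONTROLS.items():
    print(abbreviation, *describe(char))
```

Every one of these falls in the general category "Cf" (format), which is why none of them has a visible glyph of its own.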

      It’s not clear if the bidi control characters were misused prior to 2006, but by that year at least, Unicode insiders and computer security types were well aware of the malicious uses to which the bidi operators, particularly the right-to-left override (RLO) character, could be put.

      Malicious hacking goes into reverse

      The earliest reported use that I can find of the RLO control character being exploited to distribute malware dates from five years ago.

      In 2010, email spam of the “I Love You” variety disguised the SASFIS malware application as a text file attachment, by placing the invisible RLO character before the “T” in the filename—turning “I-LOVE-YOU-XOXTXT.EXE” into “I-LOVE-YOU-XOXEXE.TXT”.

      There were reports, beginning in April 2012, of U.K.-based activists and journalists in Bahrain being targeted by a cyber-campaign, apparently mounted by the Government of Bahrain, involving emails containing executable file attachments infected with a spyware called FinSpy. The attached file was spoofed with an RLO control character, so “gpj.1bajaR.exe” became “exe.Rajab1.jpg”.

      Also in 2012, there were reports of a “large scale infiltration of computer systems in the Middle East area” dubbed “Mahdi”, which employed the RLO spoofing.

      In July 2013, security researchers spotted a rarity: a Mac malware app called Janicab.A, disguised, using the RLO trick, as a PDF file. Because of the RLO character in the filename, a warning displayed by the Mac OS X operating system was confusingly displayed right-to-left.

      In 2013, according to Raymond Roberts of Microsoft, the Sirefef family of malware was found to be using RLO spoofing to hide evidence of its presence in the Windows registry (a database of user settings) by mimicking a setting initiated by a Google Chrome installation. (Roberts wrote a good overview of RLO filename spoofing in 2012.)

      This last summer a spam campaign was observed distributing a Windows remote access trojan (RAT) called DarkKomet. The attached file was named: “Airway_Bill 23-06-2015 Ind□cod.exe”, with a failed attempt to use an RLO character (indicated by the box) to disguise the malware as a Microsoft Word .doc file.

      Seeing is believing even if you shouldn’t believe your eyes

      The DarkKomet malware RLO-fail is indicative of the moderate difficulty involved in applying the RLO character to filenames and making it stick.

      The Windows operating system doesn’t normally allow Unicode control characters to be pasted into filenames and possible variations in font encoding from one computer and one operating system to another can play havoc with even the best laid plans of malicious hackers.

      Consider the file that I created just for this post, called:

      famous-croats-arzano‮gpj.vbs

      To craft the file name, I had to first perform a change to the Windows registry to allow the pasting of Unicode number codes into filenames. And, in order to protect the final product in transit via email over the Internet, I had to put it in a compressed archive.

      You can download the file as a zip archive if you’d like to examine it personally. It should be safe; Windows Defender gave it a clean bill of health. And why not? It looks just like a plain ol’ harmless JPG picture file, probably of some cats, right?

      While I’ll warrant that it’s harmless, it isn’t a JPG image file. If you double-click on it, instead of a picture, you’ll be presented with a series of clickable dialog boxes. It’s actually a Visual Basic Script—an executable program disguised as an image file using an RLO character operator, like so:

      Before the right-to-left override (the actual filename):

      famous-croats-arzanogpj.vbs

      After the right-to-left override (the filename as displayed):

      famous-croats-arzanosbv.jpg

      To anyone familiar with such things, the fact that my “JPG” still displays a default Visual Basic file icon is a bit of a giveaway but, with more work, I could’ve swapped it for a default JPG file icon.

      For that matter, I could’ve disguised the filename and icon to perfectly mimic a plain text file, an MP3 audio file, a PDF file, a Microsoft Excel spreadsheet, or whatever looked safe enough to convince a user to double-click it.

      And instead of hiding a harmless Visual Basic dialog box, it could’ve been covering for a truly harmful piece of malware.

      This shouldn’t be possible.

      For starters, the three-character file extension tells Windows what to do with a file. Windows “knows” EXEs are application programs and that VBSs are Visual Basic Scripts. If you just rename a “.VBS” or “.EXE” executable program file to a document file type such as “.JPG”, then Windows will blindly try to open the disguised program in the default program for viewing JPGs—and fail. The user will see an alert declaring: “Unknown file format, empty file or file not found!”

      My renamed VBS script opens properly because it isn’t renamed at all.

      The invisible Unicode RLO control character pasted between the letters “o” and “g” is telling Windows to reverse the appearance of the order of the last seven characters in the file name—but only the appearance.

      The filename only looks to people like “famous-croats-arzanosbv.jpg”. Windows reads the actual filename, which is “famous-croats-arzanogpj.vbs”.
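The gap between what people see and what Windows reads can be imitated in a few lines of Python. This is a deliberately simplified sketch, not the real Unicode Bidirectional Algorithm: it assumes a left-to-right context and a single RLO, and simply reverses whatever follows it. The function names `apparent_name` and `real_extension` are invented for the illustration.

```python
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def apparent_name(actual):
    """Approximate how a name containing one RLO looks to a person.

    Simplification: everything between the first RLO and the next PDF
    (or the end of the string) is shown in reverse order. Real text
    renderers apply the full bidirectional algorithm, which covers
    many cases this sketch ignores.
    """
    before, found, after = actual.partition(RLO)
    if not found:
        return actual  # no override present, nothing changes
    overridden, _, rest = after.partition(PDF)
    return before + overridden[::-1] + rest


def real_extension(actual):
    """The extension the operating system acts on: text after the last dot."""
    return actual.rsplit(".", 1)[-1] if "." in actual else ""


name = "famous-croats-arzano" + RLO + "gpj.vbs"
print(apparent_name(name))   # what a person sees
print(real_extension(name))  # what the system acts on
```

The same function reproduces the other examples in this post: the name stored on disk never changes, only the order in which its last few characters are drawn.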

      This shouldn’t be possible either!

      This is a nine-year-old-plus exploitable flaw in Unicode that is well known to everyone except ordinary computer users—who have been left in the dark and essentially defenseless.

      An upper and lower case of mistaken identity

      Not only do Unicode bi-directional control characters still allow spoofing of malware file names, but (perhaps more seriously) they may even allow website URL spoofing, like so:

      A malicious coder sets up a web domain pointing to a server, say: “http://www.evilmalware-site/” and then, to hold their malware-filled website, they set up a file directory with a nonsensical name like: “strela\moc.koobecaf.www\\:ptth”.

      This gives them the following website URL:

      http://www.evilmalware-site/strela\moc.koobecaf.www\\:ptth

      After a right-left-override control character is pasted at the beginning of the URL, it is unchanged so far as a computer is concerned but looks completely different to computer users:

      http:\\www.facebook.com\alerts/etis-erawlamlive.www//:ptth
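One way to see that the URL really is unchanged as far as the computer is concerned is to percent-encode it, which turns the invisible character into visible bytes. The sketch below uses Python's standard `urllib.parse.quote`; the helper names and the choice of which characters to leave unencoded are assumptions made for this illustration, and the URL is the made-up one from above.

```python
from urllib.parse import quote

# Embedding, override and mark characters from the table earlier in this post.
BIDI_CONTROLS = "\u202a\u202b\u202c\u202d\u202e\u200e\u200f"


def has_bidi_controls(url):
    """True if the URL contains any bidi control character."""
    return any(ch in BIDI_CONTROLS for ch in url)


def make_visible(url):
    """Percent-encode non-ASCII characters so hidden ones show up as %XX.

    Ordinary URL punctuation (and the backslashes used in the example)
    is listed as safe, so it is left alone.
    """
    return quote(url, safe=":/?#[]@!$&'()*+,;=%\\")


plain = r"http://www.evilmalware-site/strela\moc.koobecaf.www\\:ptth"
spoofed = "\u202e" + plain

print(has_bidi_controls(spoofed))
print(make_visible(spoofed))
```

The RLO shows up as the three bytes `%E2%80%AE` in front of an otherwise untouched address, so the server being contacted is still the one named at the start of the real string.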

      It’s unclear if the RLO control character has ever actually been exploited to spoof URLs but the possibility has been talked about for years as a kind of “homograph attack”.

      Homograph attacks exploit the resemblance of some non-Latin Unicode characters to characters in the Latin alphabet, in order to create domain names and website URLs that convincingly resemble well-known English-language websites.

      The word paypal in a Latin Unicode font.
      The word raural in a Cyrillic Unicode font.
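A rough way to catch this kind of lookalike is to check whether a word mixes alphabets. The sketch below is a heuristic of my own devising, not how any browser actually decides: it guesses each letter's script from the first word of its official Unicode name, and the function names are invented for the example.

```python
import unicodedata


def scripts_used(text):
    """Guess the scripts in `text` from the first word of each letter's name.

    For example "LATIN SMALL LETTER P" gives "LATIN" and
    "CYRILLIC SMALL LETTER ER" gives "CYRILLIC".
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in text
        if ch.isalpha()
    }


def looks_mixed(text):
    """True if letters from more than one script appear in the same string."""
    return len(scripts_used(text)) > 1


latin = "paypal"
# Cyrillic er, a, u, er, a followed by a Latin "l"
lookalike = "\u0440\u0430\u0443\u0440\u0430l"

print(scripts_used(latin), looks_mixed(latin))
print(scripts_used(lookalike), looks_mixed(lookalike))
```

A name written entirely in one non-Latin script would pass this check, so it only catches the mixed case.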
      My tests RLO-ing my own blog URL ( ‮https://sqwabb.wordpress.com/‬) were inconclusive, possibly due to the alias nature of the WordPress.com URLs.

      If I selected the reversed URL (from right-to-left) in the brackets, up to the end of “m” and performed a right-click “open link in new tab” in Firefox, it worked. But pasting it into the address bar in both Pale Moon/Firefox and Chrome resulted in a Google search. Admittedly, the search returned nothing but my posts, with my main blog URL as the top search result.

      Apparently all we can do is RLO with it

      The same file displayed in XP, Windows 7, Linux, and Mac OS X.

      In 2013, a Malwarebytes researcher crafted a fake Windows file, named it “anncod.exe”, added an RLO character to make it read “annexe.doc” and took it for a walk through the Internet and a few different operating systems.

      Insecure Windows XP wasn’t fooled because it’s too old to support Unicode control characters in file names. Windows 7, on the other hand, displayed the spoofed filename perfectly…wrong.

      On a Linux system, the Windows file was effectively harmless but still deceptively named. In Mac OS X, the unusable file name displayed correctly because the Mac didn’t recognize the Windows encoding of the RLO character (but that’s easily fixed).

      The takeaway is that all three desktop platforms—Windows, Linux/Unix, and Mac OS X—are equally vulnerable to Unicode control character spoofing.

      All the major free web email services allow document files, such as text, music, and images, to be attached to messages. But Gmail and Hotmail, in particular, do not allow applications (“executables”) to be directly attached to messages because of the threat from viruses. And neither service is fooled by the thin disguise of an RLO character.

      The only way to send such a spoofed application file as either a Gmail or Hotmail attachment is inside of a zipped archive.

      You should be wary of email attachments from strangers as it is, but an attachment that is a zip file supposedly containing nothing more than a text or image file is very suspicious in itself.
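If you want to peek inside a suspicious zip without double-clicking anything in it, the member names can be listed and checked for override characters first. This is a minimal sketch using Python's standard `zipfile` module; the function name is hypothetical, and it assumes the archive stores its filenames as UTF-8 (archives written by some older tools use legacy encodings and would need extra handling).

```python
import zipfile

# The embedding and override characters most useful for filename spoofing.
BIDI_CONTROLS = set("\u202a\u202b\u202c\u202d\u202e")


def suspicious_members(archive):
    """Return the names inside a zip archive that contain a bidi control.

    `archive` may be a path or an open binary file object. Nothing is
    extracted; only the list of member names is read.
    """
    with zipfile.ZipFile(archive) as zf:
        return [name for name in zf.namelist() if BIDI_CONTROLS & set(name)]
```

Usage would be along the lines of `suspicious_members("attachment.zip")`, where the filename is whatever you saved the attachment as; an empty list means no member name contains one of the listed characters, not that the archive is safe.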

      Unfortunately, both AOL mail and Yahoo Mail, in keeping with their historic regard for user security, do allow executable (Windows “EXE”) files to be directly attached to email—all the more reason why I would recommend that people be downright phobic about receiving anything from Yahoo or AOL.

      A fix for Windows Pro users but the rest of us are just in a fix

      For something like the last 14 years, one way that Microsoft has set the Home version of Windows apart from the Professional and Enterprise versions has been by not including something called the “Local Security Settings console”. This is an easy-to-use graphical front-end for the fairly baffling Windows registry, the storehouse of all user settings.

      With the Local Security Settings console, it’s easy to create a security rule that watches for Unicode bi-directional control characters and warns the user if they click on one. Otherwise, it’s all but impossible.

      So, if you have any version of Windows besides a Home version, then you can probably take advantage of the simple countermeasure against the RLO control character recommended by a 2011 advisory from the Information Technology Promotion Agency (IPA) of Japan.

      The IPA’s simple six-step instructions show Windows users who have “secpol.msc” how to launch it and create a new software restriction policy rule for the RLO control character that will generate a warning message if they click on a filename using RLO spoofing.

      Otherwise I have nothing much to offer people in the way of a fix. I can’t see how to create an applicable exception rule in the Windows Firewall or Windows Defender, and the expedient of ripping the Local Security Settings console out of a Pro version of Windows to stick it in a Home version has been bandied about but sounds far too dodgy to even try, let alone recommend.

      I see no protection against RLO spoofing offered by any major antivirus software package and I haven’t a clue what to tell Linux and Mac users.
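Short of a real fix, one stopgap that works the same on Windows, Linux and Mac is to scan a folder (a downloads folder, say) for filenames that contain bidi control characters. The sketch below is only that, a sketch: it is not the IPA's policy rule, it warns about nothing until you run it, and the function names are made up for the example.

```python
import os

# Marks, embeddings and overrides from the table earlier in this post.
BIDI_CONTROLS = set("\u200e\u200f\u202a\u202b\u202c\u202d\u202e")


def find_spoofed_names(root):
    """Walk `root` and return paths whose file name contains a bidi control."""
    hits = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if BIDI_CONTROLS & set(name):
                hits.append(os.path.join(folder, name))
    return hits


def show_safely(name):
    """Replace each control with a visible <U+XXXX> tag.

    This prints the name in its true stored order, so the real
    extension is the one that appears last.
    """
    return "".join(
        f"<U+{ord(ch):04X}>" if ch in BIDI_CONTROLS else ch for ch in name
    )
```

Printing each hit with `show_safely` matters, because printing the raw name to a terminal that honours the override would just reproduce the disguise.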

      Apparently two security wrongs almost make a right

      I always disable Microsoft’s awful “hide file extensions” feature (which is on by default as of Windows 8), so I should say that if you have it enabled, it will slightly adversely affect the look of RLO-spoofed filenames by stripping the reversed file extension and the period, but the fake extension will still be there.

      This reminds me of the original security problem that Microsoft caused by introducing this dumb feature. Back in the early 2000s, hackers quickly realized that they could name malware applications thusly: “the-sound-of-music.mp3.exe” and Microsoft’s “wonderful” new feature would trim the file to read: “the-sound-of-music.mp3”.
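The double-extension trick is easy to demonstrate, and to check for, with Python's standard `os.path.splitext`. This is a crude sketch: the function names and the short list of executable extensions are assumptions chosen for illustration, and Windows only hides extensions for file types it recognizes, which this ignores.

```python
import os

# Illustrative, far from complete, list of extensions Windows will run.
EXECUTABLE_EXTENSIONS = {".exe", ".vbs", ".scr", ".bat", ".com"}


def shown_with_extensions_hidden(filename):
    """What a file listing shows if the final extension is hidden."""
    stem, _ext = os.path.splitext(filename)
    return stem


def has_disguised_extension(filename):
    """True if an executable extension follows what looks like another one."""
    stem, ext = os.path.splitext(filename)
    inner_ext = os.path.splitext(stem)[1]
    return ext.lower() in EXECUTABLE_EXTENSIONS and bool(inner_ext)


name = "the-sound-of-music.mp3.exe"
print(shown_with_extensions_hidden(name))
print(has_disguised_extension(name))
```

Turning extension-hiding off makes the final `.exe` visible again, which is the whole argument for disabling the feature.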

      And if malware authors continue to count on Windows users hiding file extensions, it can only be “default” of Microsoft, which has shipped Windows with extension-hiding turned on by default since long before Windows 8.

      Microsoft—acting in securely, I mean active in security, going way back. 

      Stanley Q. Woodvine is a homeless resident of Vancouver who has worked in the past as an illustrator, graphic designer, and writer. Follow Stanley on Twitter at @sqwabb.
