Friday, September 21, 2012

Unicode Quirks in Django

Python 2.x's Unicode implementation really leaves something to be desired.  The major issue often arises when you need to go between the str and unicode types: the former is a sequence of 8-bit bytes, while the latter is a sequence of Unicode code points.  Calling .encode('utf-8') on a unicode object works fine (e.g. u'\u2013t'.encode('utf-8')), but calling .encode('utf-8') again on the resulting str makes Python first decode it implicitly with the ascii codec, which raises a UnicodeDecodeError as soon as it hits a non-ASCII byte.

Here's a great introduction to troubleshooting Unicode issues:

http://collective-docs.readthedocs.org/en/latest/troubleshooting/unicode.html

There's a great slide deck about demystifying Unicode in Python, which should be required reading.  It goes into more detail about the complexities of UTF encodings, and it's worth reviewing.

http://farmdev.com/talks/unicode/

One of the general rules of thumb that you'll get from this talk is to 1) decode early, 2) use unicode everywhere, and 3) encode late.

In Django, this approach is closely followed when writing data to the database.  You usually don't need to convert your unicode objects yourself because the conversion is handled at the database layer.  Assuming your SQL database is configured properly and your Django settings are set correctly, Django's database layer handles the unicode to UTF-8 conversion seamlessly.  For example, look inside the MySQLdb Python wrapper: right before a query is executed, the entire string is encoded into the specified character set:

MySQLdb/cursors.py:

        if isinstance(query, unicode):
            query = query.encode(charset)
        if args is not None:

What if you attempt to use logging.info() on Django objects (e.g. logging.info("%s" % User.objects.all()[0]))?  If you search on Stack Overflow, you'll see a recommendation to create a __str__(self) method in your Python classes that calls unicode() and converts the result to UTF-8:

http://stackoverflow.com/questions/1307014/python-str-versus-unicode

def __str__(self):
    return unicode(self).encode('utf-8')

Django's base model definitions (django.db.models.base) also follow this convention:

    def __str__(self):
        if hasattr(self, '__unicode__'):
            return force_unicode(self).encode('utf-8')
        return '%s object' % self.__class__.__name__

Normally, Python handles string interpolation automatically, choosing the result type based on whether the operands are str or unicode.  Consider these cases:

>>> print type("%s" % a)
<type 'str'>
>>> print type("%s" % 'hey')
<type 'str'>
>>> print type("%s" % u'hey')
<type 'unicode'>

Assuming the character set on your database is set to UTF-8, consider this example of how Python deals with string interpolation for class instances. Normally Python does unicode conversions automatically, but for instances of Python classes, "%s" in a str format string always invokes str() first.

class A(object):

    def __init__(self):
        self.tst = u'hello'

    def __str__(self):
        if hasattr(self, '__unicode__'):
            return self.__unicode__().encode('utf-8')
        return 'hey'

    def __unicode__(self):
        return u'hey\u2013t'

>>> a = A()
>>> print "%s" % a
hey–t
>>> print "%s %s" % (a, a.tst)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 3: ordinal not in range(128)

In this failing case, the problem is that interpolating the A instance produces a UTF-8 encoded str, which is then mixed with a.tst, a unicode object.  Python promotes the result to unicode by decoding the str with the ascii codec, which fails on the first non-ASCII byte (0xe2) and raises the UnicodeDecodeError.

The same problem happens when you declare the __unicode__() method in your Django models and then print Django objects together with attributes that contain Unicode characters, similar to the issues reported in this Stack Overflow article.  Because Python string interpolation will invoke the __str__() method, you have to be careful about intermingling Django objects and Django attributes when printing or logging them.

What's the solution?  In your Django models, it may actually be useful to return a unicode object from the __str__() method, assuming you also have a __unicode__() method defined.  One of the quirks of Python 2 is that if __str__() returns a unicode object during "%s" interpolation, Python switches to unicode formatting and calls the __unicode__() method instead.   It's somewhat counter-intuitive, but by adding this section of code, you can avoid the hazards of intermingling Django objects and attributes:

# http://www.gossamer-threads.com/lists/python/bugs/842076
def __str__(self):
    return u'%s object' % self.__class__.__name__

The recommendation is also consistent with this python-dev discussion about how to implement __str__() and __unicode__() methods:

This was added to make the transition to all Unicode in 3k easier:

. __str__() may return a string or Unicode object.
. __unicode__() must return a Unicode object.

There is no restriction on the content of the Unicode string
for __str__().

Another alternative is to prefix the format strings in your logging statements with u (e.g. u"%s"), which forces Python's string interpolation to call __unicode__() instead of __str__(). But it's easy to forget this prefix, so overriding the Django base model with this __str__() has helped me avoid triggering these UnicodeDecodeError exceptions.

Tuesday, September 18, 2012

Daily Deals/Result Glider Spyware


The extent to which this spyware has made its way into a lot of people's browsers has surprised me.  The sites dropdowndeals.com, app.mysupercheap.com and resultglider.com all seem to belong to the same company (see DNS trace below).   Both Mozilla and Intuit mention this spyware, so it's fairly pervasive.


If anyone finds a site which is prompting these users to download this plug-in, let me know...Mozilla mentions that a lot of travel sites are offering this browser download.  

Here's an article that talks about this company:


I also wonder if that's why we were seeing invalid crossdomain.xml requests too:


DNS records below:

Non-authoritative answer:
Address: 4.30.3.59


Non-authoritative answer:
Address: 4.30.3.140


Non-authoritative answer:
Address: 4.30.3.177

They also appear to be hosting YouTube Best Video Downloader (http://www.bestvideodownloader.com) and are using a mail forwarding address in Michigan:

http://www.bbb.org/western-michigan/business-reviews/internet-services/alactro-in-grandville-mi-38141335

Sunday, September 16, 2012

E911 for ObiTalk

This article seems to provide the best instructions about how to activate 911 service dialing through your ObiTalk device:

http://www.obitalk.com/forum/index.php?topic=339.msg1766#msg1766

Basically, ObiTalk devices are set up by default to dial 911 through the 'li' (Line Port).  The OutboundCallRoute provides a series of rules that are followed to determine how the call should be made (see pp. 179-180 of the ObiDeviceAdminGuide.pdf document).  The default rule is listed below:

How do these rules work?  First, the rules are each enclosed in {} and evaluated as alternatives (OR operations).   The first rule {(<#>:911):li}, for instance, describes how dialing # or 911 will route to the line port.   The second rule says that dialing **0 will invoke the automated attendant.  The remaining rules are described on page 180 of the manual:

If you want to route 911 calls to a local 24-hour emergency line instead of to the li port, you need to remove the rule that redirects 911 to the line port and then add {(<911:1xxxxxxxxxx>):spX}, where 1xxxxxxxxxx stands for the emergency line's number and spX is the SIP line you're using (e.g. sp1).

One further note: the config changes should be made in your obitalk.com settings, not directly on your ObiTalk device.  Otherwise, when you reboot, the changes will be overridden by those set on ObiTalk.com (unless you uncheck the checkbox).

Wednesday, August 29, 2012

The pernicious effects of the Comcast Protection Suite....

This mystery had been eluding me for at least 5-6 months since we started introducing JavaScript exception monitoring, but I finally was better able to understand why the Comcast Protection Suite has been causing so many problems for many of our users. No, it isn't injecting its own jQuery, and no, it isn't defining its own $, but rather it's trying to do anti-keystroke logging....


We knew anecdotally that disabling the Comcast plug-in solved the issue, but I could never explain why the exception occurred (such as the one below). Apparently the Comcast Protection Suite installs a Browser Helper Object DLL into Internet Explorer. The DLL has just as much control over the DOM as JavaScript modules do. It apparently is also adding keyup (and perhaps keydown) event handlers to the DOM, presumably to keep somebody else from capturing your keystrokes.

Not only does it slow down overall browser performance, but the DLL also causes conflicts with jQuery because jQuery tries to execute the native handlers after executing jQuery handlers (e.g. running 'onclick' handlers after bind('click') or live('click') handlers). I guess there are some issues running the DLL's onclick handlers from JavaScript, since an exception is thrown when apply() cannot be called on them. If you wrap the calls in try...catch, the problem at least gets mitigated...but this requires a change directly within jQuery 1.7.2. jQuery 1.8.0 is out, but it still has this same problem.

Also, the Comcast Protection Suite software apparently gets passed along to a lot of new Comcast users, which is why we believe this problem is so pervasive. I'm even more astounded by Norton, which had this response after a user complained about slower keystroke rates once this software was installed:

http://community.norton.com/t5/Other-Norton-Products/disappearing-keystrokes-in-webform/td-p/613012 

This particular issue appears to be isolated to this specific site and is directly related to a JavaScript function the website owners have implemented to test whether the first name or last name text filed is only alpha characters.  The method employed at the website to check for alpha characters is not the standard approach, which is to check for the input as it is added onKeyUp. The standard JS best practice is to use a regular expression that checks and validates the entire field input at form submission.  Therefore we expect this to be an isolated issue.


Besides asking every user to disable the Comcast toolbar plug-in, the workaround/fix is actually quite simple.  The code that calls the native onclick handler needs to verify that apply() can be run.  The onclick handlers installed by the DLL normally appear as 'undefined' and are therefore missing an apply() function.  We can enforce this check within the jQuery code, for which we'll be submitting a patch soon.

             // Note that this is a bare JS function and not a jQuery handler
             handle = ontype && cur[ ontype ];
-            if ( handle && jQuery.acceptData( cur ) && handle.apply( cur, data ) === false ) {
+            // Workaround for Comcast Protection Suite bug
+            // Apparently the handle function is 'undefined' but not really undefined...
+            if ( handle && jQuery.acceptData( cur ) && handle.apply && handle.apply( cur, data ) === false ) {
                 event.preventDefault();
             }
         }
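
The same guard can be pulled out into a small helper. This is only a minimal sketch, not jQuery's actual code: callNativeHandler is a hypothetical name, and fakeBhoHandler merely stands in for the handler injected by the plug-in, assumed here to be truthy but to expose no apply() method.

```javascript
// Hypothetical helper (not part of jQuery): invoke a native on<event>
// handler only when it actually exposes apply().
function callNativeHandler(handle, context, args) {
    if (!handle || !handle.apply) {
        // Objects injected by browser plug-ins can be truthy yet
        // lack apply(); treat them as "no handler".
        return undefined;
    }
    return handle.apply(context, args);
}

// A normal handler that cancels the default action.
var cancelHandler = function () { return false; };

// Stand-in for the plug-in's handler: truthy, but no apply() method.
var fakeBhoHandler = Object.create(null);

var results = [
    callNativeHandler(cancelHandler, {}, []),
    callNativeHandler(fakeBhoHandler, {}, []),
    callNativeHandler(undefined, {}, [])
];
```

Only the first call reaches apply(); the other two are skipped instead of throwing, which mirrors the extra handle.apply check in the patch.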


The jQuery bug report is here: http://bugs.jquery.com/ticket/12423

Using JavaScript line breaking in YUI Compressor

For the past 5 months since introducing JavaScript exception logging, we have seen "Object doesn't support this property or method" and "Object doesn't support property or method 'apply'" errors in Internet Explorer that have been elusive to diagnose.  The error messages never occurred in other browsers, but we saw them quite often across different users.  While some of these exceptions were generated by actually calling methods that didn't exist on JavaScript objects, many of the stack traces seemed to emanate directly from jQuery.

One of our challenges was to understand where within jQuery these exceptions were occurring.  jQuery usually comes minified with no line breaks, so we ran it through the YUI Compressor with the --line-break option (e.g. --line-break 150).  Before adding this option, Internet Explorer would often report an error on line 2, which pretty much amounted to the entire jQuery code. With the minified code broken into smaller chunks, the reported line numbers carry enough information to pinpoint the source of the conflict:

java -jar ../external/yuicompressor-2.4.7.jar jquery-1.7.2.min.js --line-break 150 > jquery-1.7.2_yui.min.js

url: https://www.myhost.com/static/js/jquery-1.7.2_yui.min.js
line: 33
context:
(f._data(m,"events")||{})[c.type]&&f._data(m,"handle"),q&&q.apply(m,d),q=o&&m[o],q&&f.acceptData(m)&&q.apply(m,d)===!1&&c.preventDefault()
}c.type=h,!g&&!c.isDefaultPrevented()&&(!p._default||p._default.apply(e.ownerDocument,d)===!1)&&(h!=="click"||!f.nodeName(e,"a"))&&f.acceptData(e)&&o&&e[h]&&(h!=="focus"&&h!=="blur"||c.target.offsetWidth!==0)&&!f.isWindow(e)&&(n=e[o],n&&(e[o]=null),f.event.triggered=h,e[h](),f.event.triggered=b,n&&(e[o]=n));return c.result}},dispatch:function(c){c=f.event.fix(c||a.event);var d=(f._data(this,"events")||{})[c.type]||[],e=d.delegateCount,g=[].slice.call(arguments,0),h=!c.exclusive&&!c.namespace,i=f.event.special[c.type]||{},j=[],k,l,m,n,o,p,q,r,s,t,u;g[0]=c,c.delegateTarget=this;if(!i.preDispatch||i.preDispatch.call(this,c)!==!1){if(e&&(!c.button||c.type!=="click")){n=f(this),n.context=this.ownerDocument||this;for(m=c.target;m!=this;m=m.parentNode||this){if(m.disabled!==!0){p={},r=[],n[0]=m;for(k=0;ke&&j.push({elem:this,matches:d.slice(e)});fo
 r(k=0;k

url: https://www.myhost.com/static/js/jquery-1.7.2_yui.min.js
column: None
line: 34
func: filter

url: https://www.myhost.com/static/js/jquery-1.7.2_yui.min.js
column: None
line: 31
func: trigger

This stack trace helped us pinpoint the issue to the Comcast Protection Suite, since it indicated the problem was happening directly inside the jQuery Event module.  The jQuery Event module is used to attach and trigger jQuery events, as well as to implement the event propagation path described in the W3C standard.  By adding try/catch clauses within the trigger() code, we were able to find exactly where the exception occurred:

        for ( i = 0; i < eventPath.length && !event.isPropagationStopped(); i++ ) {

            cur = eventPath[i][0];
            event.type = eventPath[i][1];

            handle = ( jQuery._data( cur, "events" ) || {} )[ event.type ] && jQuery._data( cur, "handle" );
            if ( handle ) {
                handle.apply( cur, data );
            }
            // Note that this is a bare JS function and not a jQuery handler
            handle = ontype && cur[ ontype ];
            if ( handle && jQuery.acceptData( cur ) && handle.apply( cur, data ) === false ) {
                event.preventDefault();
            }
        }

In other words, what jQuery seems to do is execute all the jQuery-bound handlers before attempting to call the native JavaScript handlers (e.g. jQuery 'click' handlers are executed before the native 'onclick' handler gets called).  Somehow the Comcast Protection Suite adds an onclick handler that appears as 'undefined' to jQuery.  The truthiness check in the if statement passes, but the code then fails when it attempts to execute handle.apply().
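
One way to narrow down a failure like this is to wrap each call site in a try/catch that records which step threw. The sketch below is illustrative only: runStep, failures, and the two stand-in handlers are hypothetical names and not part of jQuery.

```javascript
// Collected failures, e.g. to be shipped to an exception logger.
var failures = [];

// Run one step; if it throws, record the step's label instead of
// letting the exception escape without context.
function runStep(label, fn) {
    try {
        return fn();
    } catch (err) {
        failures.push({ step: label, message: String(err && err.message) });
        return undefined;
    }
}

var jqueryHandle = function () { return "jquery ran"; };
var nativeHandle = {};  // stand-in: truthy, but has no apply() method

var first = runStep("jQuery handlers", function () {
    return jqueryHandle.apply(null, []);
});
var second = runStep("native on<type> handler", function () {
    return nativeHandle.apply(null, []);  // throws a TypeError
});
```

After both steps run, failures holds a single entry labelled with the native handler step, which is the kind of signal that points at the second handle.apply() call rather than the first.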

More on this finding on the Comcast Protection Suite in this next writeup...

Saturday, July 21, 2012

Asynchronous loading Facebook Connect...

One of the issues you might encounter in using Facebook Connect is simply the load time for this file. After all, how long does it really take to load the all.js file hosted on their CDN servers?

<script type="text/javascript" src="https://connect.facebook.net/en_US/all.js"></script>

The Facebook docs suggest the following approach for loading all.js, which uses the HTML5 async attribute so that the rest of the page can continue to load and execute while the SDK downloads.  (Also notice that the fb-root <div> element is inserted before any Facebook Connect JavaScript code is executed, mostly because the code depends on the presence of such a DOM element to insert its cross-domain handling code.)

<div id="fb-root"></div>
<script>
  window.fbAsyncInit = function() {
    FB.init({
      appId      : 'YOUR_APP_ID', // App ID
      channelUrl : '//WWW.YOUR_DOMAIN.COM/channel.html', // Channel File
      status     : true, // check login status
      cookie     : true, // enable cookies to allow the server to access the session
      xfbml      : true  // parse XFBML
    });

    // Additional initialization code here
  };

  // Load the SDK Asynchronously
  (function(d){
     var js, id = 'facebook-jssdk', ref = d.getElementsByTagName('script')[0];
     if (d.getElementById(id)) {return;}
     js = d.createElement('script'); js.id = id; js.async = true;
     js.src = "//connect.facebook.net/en_US/all.js";
     ref.parentNode.insertBefore(js, ref);
   }(document));
</script>

The proposed solution in the Facebook docs works fine, but what if you have code in your pages that depends on the FB object (e.g. to check the login status of a user)?  The fbAsyncInit() function is called within the Facebook Connect code after everything is set up, so calls made within this function are guaranteed to find the FB object.   If you have other JavaScript code executed in different parts of your app, you'd have to override window.fbAsyncInit and possibly duplicate a lot of the initialization code, since there is no way to bind multiple handlers to the same event.

In addition, you might also think that putting code within the $(document).ready() handler would provide adequate time for all.js to be fully loaded, but because the SDK is loaded asynchronously, the browser can signal that the DOM is ready before all.js has finished downloading and executing.  In other words, an async script injected at the top of your HTML document is not guaranteed to run before the rest of the DOM has been parsed by the browser.   The net effect is a race condition in which the browser throws "'FB' is undefined" errors when you attempt to make Facebook Connect calls.

One phenomenon I noticed is that jQuery was consistently loaded before the fbAsyncInit() function was ever called.   This might best be explained by peeking at the internals of the all.js file, since the code is currently structured to execute 6910 lines before the window.FB object is even available. In addition, a lot of internal JavaScript objects are created too, and making FB.init() calls often results in network calls to Facebook's OAuth servers to determine login state.  The implication is that we could safely assume that the jQuery library was available to us, which let us create custom events with multiple handlers:

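window.fbAsyncInit = function() {
    FB.init({
        apiKey: 'YOUR_FACEBOOK_APP_ID',   // placeholder: your Facebook app ID
        oauth: true,
        channelUrl: 'YOUR_CHANNEL_URL',   // placeholder: your channel file URL
        cookie: true
    });

    // Tell every listener that the FB object is now ready to use.
    // window.jQuery avoids a ReferenceError if jQuery is missing.
    if (window.jQuery) {
        jQuery(document).trigger('fbAsyncInit');
    }
};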

In other parts of code, we simply create custom bind handlers and attach Facebook-specific code:

   $(document).bind('fbAsyncInit', function () {
     FB.getLoginStatus(function (response) { });
   });

By using this approach, we've been able to sidestep these FB undefined errors we observed and allow different parts of our application to run FB-specific code. Facebook should probably allow multiple post-init handlers, but in the interim, you may need to adopt a similar approach to avoid such load race conditions.
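
One limitation of a custom event is that a handler bound after the event has fired never runs. A small ready-queue avoids that. The following is a minimal sketch under stated assumptions, not anything the Facebook SDK provides: fbReady, onReady, and markReady are hypothetical names, and the callbacks are stand-ins that never touch a real FB object.

```javascript
// Hypothetical ready-queue: callbacks registered before the SDK is
// initialized are queued; callbacks registered later run immediately.
var fbReady = (function () {
    var initialized = false;
    var queue = [];

    return {
        // Register a callback to run once the SDK is ready.
        onReady: function (callback) {
            if (initialized) {
                callback();
            } else {
                queue.push(callback);
            }
        },
        // Intended to be called once, at the end of window.fbAsyncInit.
        markReady: function () {
            initialized = true;
            while (queue.length) {
                queue.shift()();
            }
        }
    };
}());

// Usage with stand-in callbacks.
var log = [];
fbReady.onReady(function () { log.push("early"); });
fbReady.markReady();  // would be called from fbAsyncInit after FB.init()
fbReady.onReady(function () { log.push("late"); });
```

Both the early and the late callback end up running, in that order, so code that loads after the SDK has initialized does not silently miss its chance.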

Saturday, July 7, 2012

Facebook, please more transparency...

On 6/20, I filed a bug to Facebook about an issue with all Facebook Connect-enabled sites using IE7 with Flash getting a blank dialog screen after authenticating. If you disabled Adobe Flash, then there were no problems and you didn't see the fatal blank screen of white. The problem was so consistent that I really didn't think it required more information, but I still uninstalled Flash 11 and installed Flash 10 just to make sure it wasn't an issue with newer Adobe Flash versions with IE7.   (For more background about how Facebook Connect works, see this article.)

The response from the Facebook triage team? Need more info. To try to make a point, I suggested that all you had to do was deminify their all.js JavaScript code, disable the HTML5 'postmessage' handler, and then incorporate this modified all.js in your site in lieu of theirs to reproduce the behavior across all browsers, not just IE7. The irony was that I just noticed that this source code suggestion was removed in the "need more info" section of my bug post.

The section in all.js that deals with registering different cross-domain handlers has a function called isAvailable() to determine whether to use it.  For IE7 browsers (but not IE8+), this return value will evaluate to false.  If you want to leverage other browsers with better debugging tools, the trick is to force a return false for all attempts to register the 'postmessage' handler.  By disabling the HTML5 postmessage code, the JavaScript will revert to using Adobe Flash and implement a custom postMessage() function that mimics the same behavior of the HTML version.

            r.register('postmessage', (function() {
                var s = false;
                return {
                    isAvailable: function() {
                        return false;  /* INSERTED line here */
                        return !!p.postMessage.
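
The fallback behaviour described above can be pictured as a registry that picks the first transport whose isAvailable() returns true. This is a simplified sketch and not Facebook's code: transports, register, and pickTransport are hypothetical names, and both availability checks are hard-coded stand-ins.

```javascript
var transports = [];

// Add a transport to the registry in priority order.
function register(name, impl) {
    transports.push({ name: name, impl: impl });
}

// Return the name of the first transport that reports itself usable.
function pickTransport() {
    for (var i = 0; i < transports.length; i++) {
        if (transports[i].impl.isAvailable()) {
            return transports[i].name;
        }
    }
    return null;
}

register("postmessage", {
    isAvailable: function () {
        return false;  // forced off, as in the modified all.js
    }
});
register("flash", {
    isAvailable: function () {
        return true;   // stand-in: assume the Flash plug-in is present
    }
});

var chosen = pickTransport();
```

With the first transport forced off, the selection falls through to the Flash entry, which is how a modern browser can be made to exercise the same code path as IE7.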

Almost 2 weeks went by before I noticed that IE7 logins all of a sudden started working again. What was peculiar was that on 7/3/2012, I noticed that the xd_arbiter.php version got bumped:

Changed file all_deminified.js

-/*1341364990,169916611,JIT Construction: v585470,en_US*/
+/*1341373700,169938038,JIT Construction: v585470,en_US*/

 window.FB || (function() {
     var ES5 = function() {
 ...
             }
         });
         __d("XDConfig", [], {
-            "XdUrl": "connect\/xd_arbiter.php?version=8",
+            "XdUrl": "connect\/xd_arbiter.php?version=9",
             "Flash": {
                 "path": "https:\/\/connect.facebook.net\/rsrc.php\/v1\/ys\/r\/WON-TVLCpDP.swf"
             },

You can actually download the files to see what versions 8 and 9 did.  There was a specific change that particularly interested me:

$ wget "http://static.ak.facebook.com/connect/xd_arbiter.php?version=8" -O v8.txt
$ wget "http://static.ak.facebook.com/connect/xd_arbiter.php?version=9" -O v9.txt
$ diff v8.txt v9.txt 
516,517c520,521
<                 var da = z.apply(u.context || a, aa);
<                 if (da) u.exports = da;
---
>                 var ea = z.apply(u.context || a, ba);
>                 if (ea) u.exports = ea;
642c646
<         "path": "https:\/\/s-static.ak.fbcdn.net\/rsrc.php\/v1\/ys\/r\/WON-TVLCpDP.swf"
---
>         "path": "https:\/\/connect.facebook.net\/rsrc.php\/v1\/ys\/r\/WON-TVLCpDP.swf"
743c747,751

Normally a change from fbcdn.net to facebook.net wouldn't matter, except that the .SWF file appears to check that it is hosted on connect.facebook.net.  If it isn't, the initialization code isn't executed and the postMessage receive handler may not work correctly, which means that successful logins just remain stuck and never proceed to the next step.  You can review the Flash code by using showmycode.com to decompile the .SWF file listed above.

            if (_local3 != "connect.facebook.net"){
                XdComm.fbTrace("XdComm is not loaded from connect.facebook.net", {swfDomain:_local3});
                if (_local3.substr(-13) == ".facebook.com"){
                    _local4 = PostMessage.extractPathAndQuery(_local2);
                    if (_local4.substr(0, 8) != "/intern/"){
                        XdComm.fbTrace("XdComm is NOT in intern mode", {swfPath:_local4});
                        return;
                    };
                    XdComm.fbTrace("XdComm is in intern mode", {swfPath:_local4});
                } else {
                    return;
                };

Someone must have recently noticed the problem and quietly fixed it.  This change may also explain why I started noticing a bunch of security exceptions when attempting to use the Flash v10 debugger version when logging in via IE7.  There were also a bunch of debugging messages in the JavaScript console indicating that the SWF files were being initialized correctly, except that none of the messages being sent were acknowledged by the postMessage receive handler.


I'm not sure what could have been done except to point out to Facebook exactly what needed to be fixed.  Regardless, Facebook, please be more transparent about these issues, since this incident is not the first time you've broken Facebook Connect IE7 logins and your API platform status pages rarely advertise these issues.