Saturday, June 23, 2012

Debugging Cython

The documentation posted for debugging Cython appears to be out of date. To set up Cython to generate GDB debugging symbols, you need to use the --debug-gcc parameter, which adds the extra -g2 flag at compile time.

~/projects/lxml-2.3.3$ python setup.py build_ext --debug-gcc --inplace --cython-gdb
~/projects/lxml-2.3.3$ make

Make sure that core dumps are enabled:

~/projects/yourproject$ ulimit -c unlimited # allow core dumps of any size; the default limit is 0, which disables them
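
If the crashing script is started from another Python process rather than from a shell, the limit can also be raised from inside Python. This is a minimal sketch using the standard library resource module (Unix only). The function name enable_core_dumps is made up for this illustration, and it raises the soft limit only as far as the hard limit allows, which may be lower than "unlimited":

```python
import resource


def enable_core_dumps():
    """Raise the soft core-file size limit to the hard limit (similar to `ulimit -c`)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may raise its soft limit up to, but not beyond, the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    soft, hard = enable_core_dumps()
    # resource.RLIM_INFINITY means "unlimited"
    print("core limit: soft=%s hard=%s" % (soft, hard))
```

Child processes inherit the limit, so anything launched afterwards from the same process can dump core as well.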

Now run the Python script that uses the shared object (.so) file. If the script dumps a core file, you can load it into gdb with "gdb python core":

$ gdb python core
Program terminated with signal 11, Segmentation fault.
#0  0x00007f0c1f02203b in __pyx_module_cleanup (self=, unused=) at src/lxml/lxml.etree.c:180630
180630   Py_DECREF((PyObject *)__pyx_v_4lxml_5etree___findStylesheetByID); __pyx_v_4lxml_5etree___findStylesheetByID = 0;

You can go up the call stack and see that it's my code changes that cause the problem:

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `python ba.py'.
Program terminated with signal 6, Aborted.
#0  0x00007f886e25c445 in raise () from /lib/x86_64-linux-gnu/libc.so.6

#4  0x00007f886d8a3d84 in __pyx_f_4lxml_5etree__tofilelikeC14N (__pyx_v_f=, __pyx_v_exclusive=1, __pyx_v_with_comments=0, __pyx_v_compression=0, 
    __pyx_v_inclusive_ns_prefixes=, __pyx_v_element=) at src/lxml/lxml.etree.c:102124
102124     PyMem_Free(__pyx_v_c_inclusive_ns_prefixes);

Upgrading to Ruby 1.9 on Ubuntu 12.04

sudo apt-get install ruby1.9.1-full
sudo update-alternatives --config ruby
sudo update-alternatives --config gem

You can also do:

sudo apt-get install ruby-rvm
sudo rvm reload
sudo rvm install 1.9.3-pxxx   # where xxx is the patch level printed out

Emacs and SNDCTL_TMR

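When backgrounded, Emacs appears to eat up CPU. Emacs v23.1.22 seems to fix this issue (according to this bug report: https://bugzilla.redhat.com/show_bug.cgi?id=732157). A system-call trace of the spinning process shows the same failing calls repeating over and over:
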
kill(27420, SIGHUP)                     = 0
ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff980f1e20) = -1 EIO (Input/output error)
ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff980f1d50) = -1 EIO (Input/output error)
ioctl(3, SNDCTL_TMR_STOP or TCSETSW, {B0 -opost -isig -icanon -echo ...}) = -1 EIO (Input/output error)
write(3, "\7", 1)                       = -1 EIO (Input/output error)
gettimeofday({1340485345, 421750}, NULL) = 0
gettimeofday({1340485345, 421817}, NULL) = 0
gettimeofday({1340485345, 421884}, NULL) = 0
rt_sigprocmask(SIG_BLOCK, [IO], [HUP TERM IO], 8) = 0
ioctl(3, FIONREAD, [0])                 = -1 EIO (Input/output error)
kill(27420, SIGHUP)                     = 0
ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff980f1e20) = -1 EIO (Input/output error)
ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff980f1d50) = -1 EIO (Input/output error)
ioctl(3, SNDCTL_TMR_STOP or TCSETSW, {B0 -opost -isig -icanon -echo ...}) = -1 EIO (Input/output error)
write(3, "\7", 1)                       = -1 EIO (Input/output error)
gettimeofday({1340485345, 422559}, NULL) = 0
gettimeofday({1340485345, 422627}, NULL) = 0

Saturday, June 16, 2012

Using Cython and lxml

If you've ever had a chance to use the lxml library, you'll notice that it's written in Cython. Cython lets you write Python-like code in .pyx files that are compiled into .c code and then linked into a shared object (.so) file that can be imported from Python. External declarations are defined in .pxd files (much like .h header files), and C typedefs are also supported.

There are a few quirks about Cython that I discovered while making improvements to the library. If you're learning Cython, you'll also need some C background, since the language lets you write hybrid Python and C code.

* You can intermingle Python objects and C objects in your code. Variables without an explicit type are treated as Python objects, but declaring them with cdef and a C type (e.g. int, float, char) makes them C variables:

x = 1234              # Python integer
cdef int f = 1234     # C integer

* To declare a C function, prefix the declaration with "cdef"; use "def" for Python functions. If you want your C function to return a C value, you also need to declare the return type:

cdef int my_c_func():    # returns a C integer
   return 1234

def my_py_func():        # returns a Python integer
   return 1

* Cython takes care of converting between Python and C objects according to this table. You can therefore move between C strings (char *) and Python string objects (bytes), or between C and Python numbers, just by assigning one to the other:

# Declarations
b = 1234
cdef int a

# Assignment (the Python integer is converted to a C int)
a = b

* Cython has no unary '*' operator to dereference pointers. Instead of *p, use p[0]. Check out the section on Differences between C and Cython expressions.

cdef int a = 1
cdef int *my_ptr = &a   # my_ptr points at a

my_ptr[0] = 2           # same as *my_ptr = 2 in C

* Cython performs casting with angle brackets instead of parentheses:

x = <int> 1.23

* Cython does not appear to know how to convert a Python list into a double pointer (char **) automatically. Therefore, if you have an array of character strings that you need to return, you need to declare a cdef function with that return type and malloc() the appropriate amount of memory. You'll also need to make sure you free() the memory after using the data to avoid memory leaks!

cdef char **convertfunc(py_array=None):
  cdef char **c_array
  cdef Py_ssize_t n, num_entries

  if not py_array:
    py_array = []

  num_entries = len(py_array)

  # Allocate one extra slot for the terminating NULL
  c_array = <char **>python.PyMem_Malloc(sizeof(char *) * (num_entries + 1))
  if c_array == NULL:
    return NULL

  # Converting Python objects to C type; each pointer borrows the
  # buffer of its bytes object, so py_array must outlive c_array
  for n, p_entry in enumerate(py_array):
    c_array[n] = p_entry

  c_array[num_entries] = NULL  # last entry needs to be NULL
  return c_array

To release the memory allocated, you would need to do:
  
cdef char **my_char_ptr

my_char_ptr = <char **>convertfunc(["abc", "def"])
python.PyMem_Free(my_char_ptr)
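
The layout of such an array (one pointer per string plus a trailing NULL) can be tried out from plain Python as well. The following is only a sketch using the standard library ctypes module, not Cython, and the helper names to_c_string_array and from_c_string_array are invented for this illustration. ctypes owns the memory here, so there is no explicit free step as there is with PyMem_Malloc:

```python
import ctypes


def to_c_string_array(py_strings=None):
    """Build a NULL-terminated array of C strings from a list of bytes objects."""
    py_strings = list(py_strings or [])
    # One extra slot for the terminating NULL pointer.
    array_type = ctypes.c_char_p * (len(py_strings) + 1)
    c_array = array_type()
    for n, entry in enumerate(py_strings):
        c_array[n] = entry
    c_array[len(py_strings)] = None  # None is stored as a NULL pointer
    return c_array


def from_c_string_array(c_array):
    """Walk the array until the NULL terminator, the way C code would."""
    result = []
    n = 0
    while c_array[n] is not None:
        result.append(c_array[n])
        n += 1
    return result


if __name__ == "__main__":
    arr = to_c_string_array([b"abc", b"def"])
    print(len(arr), from_c_string_array(arr))
```

The array always has one more slot than the input list, which is the same off-by-one that the malloc() size has to account for.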

Disclaimer: some of the examples above have been created to help illustrate a concept.  If you find a typo, please send a note to correct it!

Also, if you want to recompile the lxml library, make sure you pip install Cython. Otherwise, lxml will
use the pre-generated .c files that came with the package when you type "make".

Sunday, June 10, 2012

Facebook's cross-domain handler upgrades..

If you've read any postings about Facebook's cross-domain handling code, you know that Facebook has gone to great lengths to support it across most modern browsers.  The main use case is Facebook's Login Social Plugin, which opens a popup window with a form that prompts you for your email/password.

This popup needs to communicate with the parent window to send back information about what to do next upon successful login.   Since the parent window is most likely a page hosted on your site and the popup is a window hosted from facebook.com, there has to be a way for a successful login to pass a message between windows hosted on different sites.  Usually browser same-origin security policies will prevent such data exchanges, but there are legitimate use cases to allow messages to pass between windows, which is why the HTML5 postMessage function was introduced.

Older browsers such as Internet Explorer 7 don't support HTML5 postMessage, so an alternative based on Adobe Flash was used instead.   The idea was to rely on ActiveX objects to implement equivalent postMessage functionality in Flash.  If you're curious about the internals of how it was done in Flash, you can read here for more information.

The Facebook Javascript SDK has once again gone through a significant number of changes in the last month or so. While Facebook doesn't openly discuss the internals of its code, they are helpful to know, since some of the revisions directly impact how your site functions.  Some of the changes include the following:

1. A lot of the cross-domain code has been completely rewritten.  IE8 and IE9 now both appear to use the HTML5 postMessage feature natively.  There was one limitation in IE8, in that HTML5 postMessage could only be used within an IFrame, so Facebook has gone ahead and implemented the suggested workaround.  As a result, IE8 no longer tries to use Adobe Flash first and then falls back to basic IFrame message passing if Flash is not available.  IE8 and IE9 now use their native HTML5 postMessage routine.

2. You may have noticed that the cross-domain code is no longer called xd_proxy.php.  It is now called xd_arbiter.php.  The purpose of both of these files is still the same, which is to send messages to the parent window.  The xd_proxy.php file used to include 3 different callback functions that were embedded as query strings depending on whether there was a successful login: ok_session, no_session, and no_user. You used to be able to see all these query strings in the popup window. Now you can't.

These 3 states are no longer shown in the URL; they are now embedded as FORM values inside the popup window, depending on the user interaction.  For instance, there is now a cancel_url FORM input value that will close the popup window and return control back to the original frame.

http://static.ak.facebook.com/connect/xd_arbiter.php?version=6#cb=ff5d002c57c23a&origin=http%3A%2F%2F[MYHOST.COM]%2Ff1e938e69c3d116

The most important piece of information passed is the callback (cb) parameter, carried after the # in the URL above, which contains a globally unique identifier (guid). The guid corresponds to a key in the FB._callbacks dictionary and is used to look up the code to run. This callback function should correspond to a key in the parent window. If you close that parent window, you may notice a blank dialog permissions.request screen if you were to log in.

3. Within the fb-root tag, there is now an IFrame with the ID fb_xdm_frame_http or fb_xdm_frame_https (depending on the protocol) that implements the cross-domain receivers, whether they work via postMessage, Flash, or IFrames.  You can also figure out which transport mechanism is in use by looking at the transport= parameter.

<iframe id="fb_xdm_frame_https" name="fb_xdm_frame_https" src="https://s-static.ak.facebook.com/connect/xd_arbiter.php?version=6#channel=f38dfabf84&amp;origin=https%3A%2F%2F[MY_APP_HOSTNAME]&amp;channel_path=%2F%3Ffb_xd_fragment%23xd_sig%3Df3b4c64464%26&amp;transport=postmessage"></iframe>

If you want to better understand how all the cross-domain messages get passed, I've found it useful to deminify the all.js code, substitute the script include, and then put a breakpoint on the xdRecv handler.  This allows you to see how Facebook decodes the query string parameters in xd_arbiter.php, extracts the callback function guid, and executes the code to set the appropriate cookies on your browser with the signed token response.

In summary, the big win here is that IE8 now uses the native HTML5 postMessage and therefore no longer depends on the Adobe Shockwave/Flash Add-On in your browser.   You can also find out which transport mechanism is in use by inspecting the IFrame stored inside the fb-root div tag.

Saturday, June 9, 2012

Walking up the Facebook login stack..

Put a breakpoint on the xdRecv handler inside all.js and you can see how the query string is processed:

 1. Send the message to the parent window.
y.send(ca,x,parent,v);

ca = parameters specified in the initial window.open()
x = domainname
parent = parent window
v = 

2. Uses postMessage to send data.
v.postmessage('_FB_' + w + t, u)

3. Post-receiver receives the message.
  s.onMessage(v, w);

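4. Decodes the message into an object: JSON if it starts with '{', otherwise a query string (see the QueryString decode() function). If the message carries a cb key, the matching callback is looked up and called as ja(ga).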
               h(function() {
                    if (typeof ga == 'string') if (ga.substring(0, 1) == '{') {
                        try {
                            ga = ES5('JSON', 'parse', false, ga);
                        } catch (ia) {
                            m.warn('Failed to decode %s as JSON', ga);
                            return;
                        }
                    } else ga = n.decode(ga);
                    if (!ha) if (ga.xd_sig == u) ha = ga.xd_origin;
                    if (ga.xd_action) {
                        ba(ga, ha);
                        return;
                    }
                    if (ga.access_token) k._https = /^https/.test(w);
                    if (ga.cb) {
                        var ja = k.XD._callbacks[ga.cb];
                        if (!k.XD._forever[ga.cb]) delete k.XD._callbacks[ga.cb];
                        if (ja) ja(ga);
                    }
                });
            }
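
To make the control flow of that minified handler easier to follow, here is a rough Python sketch of the same decode-and-dispatch idea. It is not Facebook's code: the names callbacks, forever, decode_message and dispatch are invented for this illustration, and the xd_sig origin check, the xd_action branch and the https flag are left out on purpose.

```python
import json
from urllib.parse import parse_qs

callbacks = {}  # guid -> function; stands in for the _callbacks dictionary
forever = {}    # guid -> True for callbacks that should survive being called


def decode_message(raw):
    """Decode a message that is either a JSON object or a query string."""
    if isinstance(raw, str):
        if raw.startswith("{"):
            return json.loads(raw)
        # parse_qs returns a list per key; keep the first value of each
        return {key: values[0] for key, values in parse_qs(raw).items()}
    return raw


def dispatch(raw):
    """Look up the callback named by 'cb', drop one-shot entries, then call it."""
    try:
        message = decode_message(raw)
    except ValueError:
        # Mirrors the "Failed to decode ... as JSON" early return
        return None
    guid = message.get("cb")
    if guid is None:
        return None
    handler = callbacks.get(guid)
    if not forever.get(guid):
        callbacks.pop(guid, None)  # one-shot callbacks are removed before the call
    return handler(message) if handler else None


if __name__ == "__main__":
    callbacks["ff5d002c57c23a"] = lambda msg: "token=" + msg["access_token"]
    print(dispatch("cb=ff5d002c57c23a&access_token=EXAMPLE"))
    print(dispatch("cb=ff5d002c57c23a&access_token=EXAMPLE"))  # already removed
```

The second call finds nothing because the entry was deleted after its first use, which matches the blank screen you get when the parent window that registered the callback is gone.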

Thursday, June 7, 2012

Facebook permissions.request

After you do an /oauth/dialog request, a bunch of other data is passed in the permissions request:

https://www.facebook.com/dialog/permissions.request?_path=permissions.request&app_id=[MY_APP_ID]&redirect_uri=http%3A%2F%2Fstatic.ak.facebook.com%2Fconnect%2Fxd_arbiter.php%3Fversion%3D6%23cb%3Dff5d002c57c23a%26origin%3Dhttp%253A%252F%252F[MYHOST.COM]%252Ff1e938e69c3d116%26domain%3D[MYHOST.COM]%26relation%3Dopener%26frame%3Df10f71bf23f26f8&sdk=joey&display=popup&response_type=token%2Csigned_request&domain=[MYHOST.COM]&fbconnect=1&from_login=1&client_id=132581756764290
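
The URL is easier to read once its layers are peeled apart: redirect_uri is percent-encoded once, and the origin nested inside its fragment is encoded twice (%253A instead of %3A). Here is a small sketch with the Python standard library; the URL below is a shortened version of the one above, with example.com standing in for [MYHOST.COM], and first_values is a helper made up for this illustration:

```python
from urllib.parse import parse_qs, urlsplit

# Shortened copy of the request above; example.com replaces [MYHOST.COM]
request_url = (
    "https://www.facebook.com/dialog/permissions.request"
    "?_path=permissions.request"
    "&redirect_uri=http%3A%2F%2Fstatic.ak.facebook.com%2Fconnect%2Fxd_arbiter.php"
    "%3Fversion%3D6%23cb%3Dff5d002c57c23a"
    "%26origin%3Dhttp%253A%252F%252Fexample.com%252Ff1e938e69c3d116"
    "%26relation%3Dopener"
    "&display=popup&response_type=token%2Csigned_request"
)


def first_values(query):
    """Parse a query string and keep the first value of each key."""
    return {key: values[0] for key, values in parse_qs(query).items()}


# Layer 1: the query string of the permissions.request URL
outer = first_values(urlsplit(request_url).query)

# Layer 2: redirect_uri is itself a URL whose fragment carries the XD parameters
redirect = urlsplit(outer["redirect_uri"])
inner = first_values(redirect.fragment)

if __name__ == "__main__":
    print(outer["response_type"])
    print(redirect.path)
    print(inner["cb"], inner["origin"], inner["relation"])
```

The cb value recovered from the inner layer is the same guid that shows up again in the message sent back to the opener window.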

The data that comes back (assuming you're using Chrome) is a block of JavaScript.  Note that either function d() or function c() is executed, depending on the browser/user-agent string and on whether the page is the top-level window:

var message = "cb=ff5d002c57c23a&origin=http\u00253A\u00252F\u00252F[MYHOST.COM]\u00252Ff1e938e69c3d116&domain=[MYHOST.COM]&relation=opener&frame=f10f71bf23f26f8&access_token=[ACCESS TOKEN]&expires_in=0&signed_request=[SIGNED_REQUEST]&base_domain=[MYHOST.COM]",
    origin = "http:\/\/[MYHOST.COM]\/f1e938e69c3d116";
document.domain = 'facebook.com';
(function () {
    var a = window.opener || window.parent,
        b = 'fb_xdm_frame_' + location.protocol.replace(':', '');

    function c() {
        try {
            a.frames[b].proxyMessage(message);
        } catch (e) {
            setTimeout(c, 100);
        }
    }
    function d() {
        __fbNative.postMessage(message, origin);
    }
    if (window === top && /FBAN\/\w+;/i.test(navigator.userAgent)) {
        if (window.__fbNative && __fbNative.postMessage) {
            d();
        } else window.addEventListener('fbNativeReady', d);
    } else c();
})();
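
The user-agent test in that snippet is an ordinary case-insensitive regular expression looking for an FBAN/...; token. As a sketch, the same pattern can be tried in Python. The function name is_fb_native_app and both user-agent strings are made up for this illustration, and the window === top half of the condition has no equivalent here:

```python
import re

# Same pattern as /FBAN\/\w+;/i in the JavaScript above
FBAN_PATTERN = re.compile(r"FBAN/\w+;", re.IGNORECASE)


def is_fb_native_app(user_agent):
    """Return True when the user-agent string carries an FBAN/<name>; token."""
    return FBAN_PATTERN.search(user_agent) is not None


if __name__ == "__main__":
    # Hypothetical user-agent strings, for illustration only
    print(is_fb_native_app("Mozilla/5.0 (iPhone) [FBAN/FBIOS;FBAV/5.0]"))
    print(is_fb_native_app("Mozilla/5.0 (X11; Linux x86_64) Chrome/19.0"))
```

When the pattern matches, the page hands the message to __fbNative.postMessage (function d); otherwise it keeps retrying proxyMessage on the fb_xdm_frame IFrame of the opener (function c).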

Saturday, June 2, 2012

Installing PHP editing support in Emacs v23

Apparently you have to install this package for PHP editing in Emacs to work:
sudo apt-get install php-elisp