Saturday, June 2, 2012

Installing PHP editing support in Emacs v23

Apparently you have to install this package for PHP editing in Emacs to work:
sudo apt-get install php-elisp

Friday, May 18, 2012

window.JSON no longer exists

Facebook's JavaScript Connect library used to provide JSON manipulation tools attached to window.JSON, but seems to have recently removed them:

https://github.com/rogerhu/connect-js/commit/9ea3a84f9d11d7b15f962eb977d0af8b303de259

This change has implications for Internet Explorer 7 (IE7), which doesn't have its own built-in JSON library.  As a result, you're going to have to import one manually now!
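Since window.JSON can no longer be assumed to exist, a common pattern is to feature-detect native JSON support and pull in a shim such as Douglas Crockford's json2.js only when it's missing. Here's a rough sketch (the json2.js path is a placeholder; adjust it to wherever you host the shim):

```javascript
// Returns true when the environment has a usable native JSON object,
// which IE7 does not.
function hasNativeJSON(win) {
  return !!(win.JSON &&
            typeof win.JSON.parse === 'function' &&
            typeof win.JSON.stringify === 'function');
}

// At page load (the json2.js path is a placeholder):
// if (!hasNativeJSON(window)) {
//   document.write('<script src="/static/js/json2.js"><\/script>');
// }
```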

Wednesday, May 9, 2012

Using Apache2's mod_auth_openid...

A previous writeup discussed how to implement Google Apps user-based authentication with Apache2. The advantage is that you can implement single sign-on (SSO) to your web application through Google Apps instead of relying on usernames/passwords stored in .htaccess files. One problem with that approach is that it requires a few custom PHP scripts and Memcached to implement.

In search of a better way, I came upon the mod_auth_openid Apache2 module, which uses OpenID to implement user authentication for Apache2. If you're curious about how to implement the Apache2 OpenID module, here's how we got things to work.

1. First, you need to make sure you've enabled OpenID Federated Login on your Google Apps domain. See this link (item #1) for more information.

2. If you're running Ubuntu v10.04, the mod_auth_openid that comes with the package repository is v0.3, which doesn't include the AuthOpenIDAXUsername feature added in v0.6. You will need to download and compile your own version of mod_auth_openid from http://findingscience.com/mod_auth_openid/releases.html, or fork a copy of the source at git://github.com/bmuller/mod_auth_openid.git.

3. You'll need, at a minimum, these packages to build mod_auth_openid. Without automake, you won't be able to run the autogen.sh script; libcurl (v4, OpenSSL flavor) and the libopkele-dev package also need to be installed:
sudo apt-get install automake autotools-dev libtool libtidy-dev \
    libcurl4-openssl-dev libopkele-dev

Your installation would then be:
./autogen.sh
./configure
make

4. Once you've successfully compiled the mod_auth_openid code with the steps above, copy src/.libs/mod_auth_openid.so to /usr/lib/apache2/modules. Your Apache2 configuration then needs to look like:
  LoadModule authopenid_module /usr/lib/apache2/modules/mod_auth_openid.so

  <Location "/">
      AuthType OpenID
      require valid-user

      AuthOpenIDTrusted ^https://www.google.com/accounts/o8/ud
      AuthOpenIDAXRequire email http://openid.net/schema/contact/email @yourgoogleappsdomain\.com
      AuthOpenIDSingleIdP https://www.google.com/accounts/o8/id
      AuthOpenIDAXUsername email
      # Off for now; turn on once the site is served over HTTPS
      AuthOpenIDSecureCookie Off
  </Location>

The AuthOpenIDTrusted and AuthOpenIDSingleIdP directives ensure that Google Apps will be the only trusted Identity Provider (IdP). The AuthOpenIDAXRequire directive retrieves the user's email via the http://openid.net/schema/contact/email attribute and forces it to match a regex for your Google Apps domain. That last part took a bit of trial and error to figure out after reading through Google's Getting Started documentation.
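To illustrate what that AuthOpenIDAXRequire regex is doing, here's a quick Python sketch (the domain is a placeholder) of how the returned email attribute gets matched against your Google Apps domain:

```python
import re

# mod_auth_openid matches the AX attribute value (here, the user's email)
# against the regex given as the last AuthOpenIDAXRequire argument.
# You may want to anchor it with $ to be stricter than the config above.
DOMAIN_RE = re.compile(r"@yourgoogleappsdomain\.com")

def email_allowed(email):
    """Allow only users whose email matches the Google Apps domain regex."""
    return bool(DOMAIN_RE.search(email))
```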

There are other installation details involved in getting the plugin set up, but hopefully this gives an intro to taking advantage of some of the newest features of the Apache2 OpenID module!

Friday, April 13, 2012

Django v1.4 sessions

We recently upgraded to Django v1.4 and noticed that the session key was no longer being captured in our logs; it appeared as "None" instead of a unique identifier string. Previously, these session keys were implicitly created on the first reference to request.session.session_key in your code (assuming django.contrib.sessions is in your INSTALLED_APPS and its middleware is enabled), but we noticed no new session cookies were being created.

Inside django.contrib.sessions, request.session is normally created in the session middleware:

class SessionMiddleware(object):
    def process_request(self, request):
        engine = import_module(settings.SESSION_ENGINE)
        session_key = request.COOKIES.get(settings.SESSION_COOKIE_NAME, None)
        request.session = engine.SessionStore(session_key)
Before Django v1.4, session_key in django/contrib/sessions/backends/base.py is a property that creates a session key on the first reference:

    def _get_session_key(self):
        if self._session_key:
            return self._session_key
        else:
            self._session_key = self._get_new_session_key()
            return self._session_key
Although Django v1.4 has documentation about the new signed cookie session backend, there is no mention of this change. It turns out that session keys are no longer implicitly created unless the session is modified directly, as shown in this diff:

https://code.djangoproject.com/changeset/17155
commit 864facf5ccc0879dde7e608a3bea3f4bee1d3a57
Author: aaugustin
Date: Sun Nov 27 17:52:24 2011 +0000

Fixed #11555 -- Made SessionBase.session_key read-only. Cleaned up code slightly. Refs #13478.

This also removes the implicit initialization of the session key on the first access in favor of explicit initialization.
In other words, it used to be that merely referencing session_key would cause its creation. The implication is that you must now explicitly initialize a session key somewhere in your code. If you start noticing None session keys, chances are you were relying on the old implicit behavior. The fix is to create one somewhere in your middleware.

If your middleware comes after django.contrib.sessions.middleware.SessionMiddleware, then you could initialize the key with the following:
if hasattr(request, 'session') and request.session.session_key is None:
    logging.debug("Creating a session key since there is none..")
    request.session.create()

This bug (with a proposed fix) has been filed in: https://code.djangoproject.com/ticket/18128
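Putting the fix together, a complete middleware sketch might look like this (the class name is ours; it relies only on the request.session API, so it works with any session backend, and it must be listed after django.contrib.sessions.middleware.SessionMiddleware so that request.session exists):

```python
import logging

class EnsureSessionKeyMiddleware(object):
    """Explicitly create a session key, since Django 1.4 no longer
    creates one implicitly on the first reference to session_key."""

    def process_request(self, request):
        session = getattr(request, 'session', None)
        if session is not None and session.session_key is None:
            logging.debug("Creating a session key since there is none..")
            session.create()
```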

Thursday, April 12, 2012

Receiving git commit notifications by email

GitHub has the ability to check commits before they're merged, but what if you want to receive the commit diffs by email so that you know when they actually made it into your main branch? The Ruby script git-commit-notifier fills this need: you can take advantage of GitHub's post-receive hooks and use a Sinatra server to receive the notifications via HTTP POST requests.

So how would you integrate git-commit-notifier with GitHub? First you need to clone the repository on the host that will receive these POST requests. Because GitHub sends a post-receive hook for each merge, your Sinatra server can often receive multiple web requests at once. The naive way is to launch the script synchronously inside the POST handler (i.e. with %x or system in Ruby), but that approach requires every operation to complete before the HTTP connection is closed.

require 'rubygems'
require 'json'
require 'sinatra'

post '/' do
  if params[:payload]
    push = JSON.parse(params[:payload])
    print "JSON response: #{push.inspect}"
  end

  # Tried running this as a daemon and forking, but a plain system()
  # call seems to produce fewer duplicate messages.
  system("mirror_repo.sh")
end

The mirror_repo.sh script that performs the mirror-and-notify task would then look something like the following:
#!/bin/sh
cd /home/myrepo
git fetch github
# Send diffs between our branch and github/master
CURRENT_HASH=`git log HEAD -1 --pretty=%H`
NEW_HASH=`git log github/master -1 --pretty=%H`
# git merge will actually be a fast-forward
git merge github/master
git log --no-merges --pretty=format:"%P %H" $CURRENT_HASH..$NEW_HASH | awk '{print "echo " $0 " refs/heads/master | git-commit-notifier /usr/local/config/git-commit-notifier.yml"}' | sh
git checkout master
exit 0

The basic approach fetches the branch (assuming the remote is named 'github'), merges it, and pipes each parent/commit hash pair listed by git log into the git-commit-notifier script. Assuming the upstream branch is simply an ancestor of the current branch, the merge will be a fast-forward.
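To make the awk pipeline in the script above less opaque, here's what it expands to for a single git log line (the hashes are made up; in the real script the generated command is then piped to sh to actually run the notifier):

```shell
# Each "parent-hash commit-hash" line emitted by git log becomes one
# git-commit-notifier invocation, simulated here with a fake log line:
echo "aaa111 bbb222" | awk '{print "echo " $0 " refs/heads/master | git-commit-notifier /usr/local/config/git-commit-notifier.yml"}'
```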

If you plan on mirroring the repository (i.e. via Gitosis), you could set up a post-receive hook configured to send diffs each time they were merged:
while read oldrev newrev ref
do
  if [ "$ref" == "refs/heads/master" ]; then
    echo "$oldrev $newrev $ref" | git-commit-notifier /usr/local/config/git-commit-notifier.yml
  fi
done

Thursday, April 5, 2012

Tracking JavaScript exceptions with TraceKit

One of the challenges in modern browsers is trapping all the JavaScript exceptions your users might be experiencing. Internet Explorer (IE) is especially unforgiving about invalid JavaScript commands. If you call $('#myid').text() on an input HTML tag, IE will raise an "Object doesn't support this property or method" error. IE also has security restrictions that prevent you from submitting a change to an input file HTML tag, producing an "Access Denied" error message that you won't see on other browsers such as Chrome and Firefox. You wouldn't know about these issues unless you found ways to report these errors.

The first thing you might consider is using window.onerror() to perform an Ajax call to the server side to log the message, URL, and line number. This approach works, but provides limited information, especially for minified JavaScript code, since the line number will always refer to the first few lines of the file! Without a column number, you'd be lost trying to find the offending code. And since the stack trace in JavaScript is already gone by the time window.onerror() is called, you have to intercept the exception before it reaches that function.
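A minimal sketch of that naive approach (the endpoint and field names are ours, not from any particular library):

```javascript
// Build the report body from what window.onerror gives us: just a
// message, url and line -- no stack trace, hence the limitation above.
function buildErrorReport(message, url, line) {
  return 'message=' + encodeURIComponent(message) +
         '&url=' + encodeURIComponent(url) +
         '&line=' + line;
}

// Hypothetical wiring: POST the report to your logging endpoint.
// window.onerror = function (message, url, line) {
//   var xhr = new XMLHttpRequest();
//   xhr.open('POST', '/error_reporting/', true);
//   xhr.setRequestHeader('Content-Type', 'application/x-www-form-urlencoded');
//   xhr.send(buildErrorReport(message, url, line));
//   return true; // suppress the default error handling
// };
```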

An open-source alternative is TraceKit, which provides a cross-browser mechanism for capturing stack traces. You can then set things up to start receiving stack traces similar to the following:
url: https://myhost.com/static/js/tracekit.js
line: 164
context:
}, (stack.incomplete ? 2000 : 0));

throw ex; // re-throw to propagate to the top level (and cause window.onerror)
}
func:

url: https://myhost.com/static/js/my.js
column: 45
line: 1841
context:

if ($("#mydata").length === 0) {
$('#myid').text(title).attr('href', url);
$('#myid2_url').text(url).attr('href', url);
func: myfunction_name


url: https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js
column: 4351
line: 3
context: None
func: HTMLDocument.

(You may notice the rethrow line at the top. For Internet Explorer, the exception needs to be rethrown to reach the top of the browser; on other browsers, you'll just have to live with this extraneous frame.)

How does TraceKit work? Since only the error message, URL, and line number are reported once window.onerror() is called, the key to this approach is to wrap jQuery's document.ready(), $.event.add, and $.ajax calls in a try/catch block, collecting the stack traces, finding the code snippet from the line/column number, and adding a reporting handler that will be called after all the stack traces have been pulled.
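The core of that wrapping idea can be sketched as follows. This is our own simplified version, not TraceKit's actual source; the report parameter stands in for TraceKit.report:

```javascript
// Wrap a function so any exception is reported (with its stack trace
// still intact) before propagating on to window.onerror.
function wrapForReporting(fn, report) {
  return function () {
    try {
      return fn.apply(this, arguments);
    } catch (ex) {
      report(ex);  // collect the stack trace while it is still available
      throw ex;    // re-throw so window.onerror still fires (needed on IE)
    }
  };
}
```

TraceKit applies this pattern to the jQuery entry points mentioned above, so every callback your page registers passes through such a wrapper.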

You can read through the rest of the code to see how the stack traces are derived for Internet Explorer, Chrome, Safari, and Firefox. In order to derive the code snippet, TraceKit actually performs an XMLHttpRequest (in other words, an Ajax request) to retrieve the source file from your own host. As a result, any script that's not located on your server will be printed as "HTMLDocument." in the final stack trace output. The current version in the GitHub repo doesn't fully obey this restriction, so there's a pull request that attempts to retrieve .js files only if they are part of the document.domain. There are also pull requests to retrieve files with https:// prefixes and to deal with jQuery 1.7+, which introduces an additional parameter in the $.event.add() function.

You can then setup a reporting handler to POST this data back to your server:
TraceKit.report.subscribe(function (stackInfo) {
  $.ajax({
    url: '/error_reporting/',
    type: 'POST',
    data: {
      browserUrl: window.location.href,
      stackInfo: JSON.stringify(stackInfo)
    }
  });
  return true; // suppress error on client
});
The information that gets passed includes a list of stack traces, each element being a dictionary. You can decode and format it with something like the following code:

import json
import logging

def process_js_error(request):
    stack_info = request.POST.get('stackInfo', '')
    logging.debug("stackInfo: %s" % stack_info)
    try:
        stack_info = json.loads(stack_info)
    except ValueError:
        content = "Stack Info: %s\nBrowser URL: %s\nUser Agent: %s\n" % \
            (stack_info, request.POST['browserUrl'],
             request.META['HTTP_USER_AGENT'])
    else:
        # Prettier formatting of the stack trace.
        if isinstance(stack_info['stack'], list):
            stack_trace = "\n"
            for stack_item in stack_info['stack']:
                for key, val in stack_item.iteritems():
                    if isinstance(val, list) and val:
                        stack_trace += "%s:\n%s\n" % (key, "\n".join(val))
                    else:
                        stack_trace += "%s: %s\n" % (key, val)
                stack_trace += "\n"
        else:
            stack_trace = stack_info['stack']

        content = """
Error Type: %(error_type)s
Browser URL: %(browser_url)s
User Agent: %(user_agent)s
Stack Trace: %(stack_trace)s
""" % {'error_type': stack_info['message'],
       'browser_url': request.POST['browserUrl'],
       'user_agent': request.META['HTTP_USER_AGENT'],
       'stack_trace': stack_trace}

There are a bunch of pull requests that you may want to consider when using TraceKit, some of which were discussed in this writeup. Also remember the limitations mentioned above. First, Chrome/Firefox/Opera will for the most part output full stack frames with column numbers, but Internet Explorer will only report the top stack frame, and column numbers are not always guaranteed. Furthermore, code snippets can only be retrieved for files hosted locally on your site (i.e. not for files hosted on a CDN, such as those prefixed with https://ajax.googleapis.com/). Hope you find this tool as useful for tracking down your JavaScript exceptions as we did!

Tuesday, April 3, 2012

Git 1.7.4.2

Git as shipped with Ubuntu 10.04 has a major bug: it keeps a file descriptor open for every pack file. If you mirror repositories (e.g. with git clone --mirror), you will easily run out of file descriptors and may see the following error messages:
error: cannot create pipe for index-pack: Too many open files
fatal: fetch-pack: unable to fork off index-pack
error: cannot create pipe for pack-objects: Too many open files
fatal: git pack-objects failed: Too many open files
The fix was made in Git 1.7.4.2:
https://raw.github.com/gitster/git/master/Documentation/RelNotes/1.7.4.2.txt

 * We used to keep one file descriptor open for each and every packfile
   that we have a mmap window on it (read: "in use"), even when for very
   tiny packfiles.  We now close the file descriptor early when the entire
   packfile fits inside one mmap window.

The fix is to add the git-core PPA (https://launchpad.net/~git-core/+archive/ppa) to your /etc/apt/sources.list:
deb http://ppa.launchpad.net/git-core/ppa/ubuntu lucid main
deb-src http://ppa.launchpad.net/git-core/ppa/ubuntu lucid main

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E1DF1F24
sudo apt-get update
sudo apt-cache search git
sudo apt-get install git=1:1.7.9.4-1.1~ppa1~lucid1