Experimental code to improve AES performance. Measured about a 25% improvement on ia32.