Updates RSS

Calculating day of week and Easter date

Branchless code for calculating day of week, Easter date, and Jewish holidays.

Using DAWG for predictive text input

How to find words by their prefix, which can be mistyped, and sort them by frequency stored with the word.

Recommended books and sites

Intel® Architecture Instruction Set Extensions Programming Reference (PDF reference). A description of the new instructions in the upcoming Haswell processor, including transactional memory support, a hardware random number generator, and 256-bit vector integer operations. The transactional memory instructions should be useful for the GIL (global interpreter lock) in Python and Ruby: earlier attempts to eliminate it with software TM were too slow.

Win32 Assembly Cheat Sheet

Added Serbo-Croatian translation

Performance measurements with RDTSC

Corrected links (thanks to Joseph)

Comments RSS

Peter Kankowski:

Hi Ali, unfortunately, this article is about x86 only. My experience with ARM is limited.


Hi, do you have these statistics for ARM processors?

Peter Kankowski:

Thanks, I will test your function.

Leonid Yuriev:

It seems t1ha is superior to all of the above functions, both in speed and in quality. Of course, it could be used with Folly's F14 hash table. https://github.com/PositiveTechnologies/t1ha

Peter Kankowski:

Thank you, it's something I missed in the blog post.


Discussion: the first language *

Which programming language to learn first?

Recommended books and sites *

The minimal reading list to become a good programmer.

New links: 2011

Important news, interesting finds.

Thoughts on Windows Phone 7

Why there are no serious WP7 apps.


Hash functions: An empirical comparison *

Benchmark program for hash tables and comparison of 15 popular hash functions.

Plural forms

Spelling messages like "1 file found" or "5 files found" correctly in any language.

Two-stage tables for storing Unicode character properties

A data structure for storing the properties of Unicode characters.

Using ternary DAGs for spelling correction

A ternary structure for storing dictionaries is proposed. The structure is based on ternary search trie that is "compressed" into a DAG by linking together equal subtrees. By using it, you can eliminate affix stripping and implement a faster spelling corrector.

Searching for duplicate files

Designing an effective algorithm for finding identical files.

Natural order sorting

How to sort filenames like Picture17.png.

Using DAWG for predictive text input

How to find words by their prefix, which can be mistyped, and sort them by frequency stored with the word.

Calculating day of week and Easter date

Branchless code for calculating day of week, Easter date, and Jewish holidays.

Assembly language and machine code

Win32 Assembly Cheat Sheet

One-page reference for Win32 assembly language programming.

x86 Machine Code Statistics *

Which instructions and addressing modes are used most often, and what the average instruction length is.

Redundancy of x86 Machine Code

One assembly language instruction can be encoded differently in machine code. Possible applications are steganography and compiler identification.


PHP database library

Designing a database interface for web programming.

Metaprogramming with aggressive inlining

Inlined and optimized functions can be used as macros.

Software interface design tips *

How to design easy-to-use interfaces between modules of your program.

Interpreters and compilers

Introduction: Writing a simple expression evaluator

Code generator

Reverse Polish Notation and Expression Compiler

Adding variables to expression evaluator

Four-part series about creating an expression evaluator.

Python critique

You should choose between Python 2 and 3, find a good book, and keep in mind the diverse data types.

The future of compilers

How programming languages may look in the future.

Win32 programming

Self-extracting executables

Techniques for creating SFX archives and interpreters.

Enabling additional compiler warnings

How to catch more bugs with MSVC++.

Detecting access to freed memory

Improving the MSVC++ free function.

Debugging buffer overflows

A technique for debugging complex parsers.

Corrections to Raymond Chen's wheel scrolling code

For one rotation of the wheel, never scroll the window by more than one page.

Dark corners in Microsoft's documentation

Hexadecimal locale IDs, saving user data on shutdown, and other bugs.

The perils of alloca function

alloca is useful for small arrays or strings, but can be dangerous if the array is larger than you expected.

Low-level code optimization

What your compiler can do for you

You should not obfuscate your code with low-level optimizations: your compiler can do them for you.

Performance measurements with RDTSC

A method for measuring performance with the RDTSC instruction.

Optimized abs function

Branchless abs function for x86 processors.
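
As a rough illustration of what "branchless" means here, the sketch below uses the common sign-mask trick. It is not necessarily the code from the article; the name `abs_branchless` is hypothetical, and the sketch assumes that `>>` on a negative `int` is an arithmetic shift (implementation-defined in C, but the usual behaviour of x86 compilers).

```c
#include <assert.h>
#include <limits.h>

/* Absolute value without a conditional jump (hypothetical helper).
   Assumption: right-shifting a negative int replicates the sign bit. */
static int abs_branchless(int x)
{
    assert(x != INT_MIN);  /* |INT_MIN| is not representable as an int */

    /* mask is 0 for x >= 0 and -1 (all bits set) for x < 0 */
    int mask = x >> (sizeof(int) * CHAR_BIT - 1);

    /* For negative x: (x ^ -1) is ~x, and ~x - (-1) is ~x + 1, i.e. -x.
       For non-negative x: (x ^ 0) - 0 is x. */
    return (x ^ mask) - mask;
}
```

For example, `abs_branchless(-5)` yields 5 and `abs_branchless(7)` yields 7. Modern compilers often generate similar code for the standard `abs` on their own.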

Calculating standard deviation in one pass

An example of using algebraic transformations for code optimization.
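
One well-known algebraic transformation of this kind rewrites the variance as the mean of the squares minus the square of the mean, so both sums can be accumulated in a single loop. The sketch below assumes that formulation (the article may use a different one); `variance_one_pass` is a hypothetical name, and the function returns the population variance, whose square root is the standard deviation.

```c
#include <assert.h>
#include <stddef.h>

/* Population variance in one pass over the data (hypothetical helper).
   Uses the identity Var(x) = E[x^2] - (E[x])^2. */
static double variance_one_pass(const double *a, size_t n)
{
    assert(n > 0);

    double sum = 0.0, sum_sq = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += a[i];            /* accumulates the sum of x   */
        sum_sq += a[i] * a[i];  /* accumulates the sum of x^2 */
    }

    double mean = sum / (double)n;
    double var = sum_sq / (double)n - mean * mean;

    /* Rounding can push the result slightly below zero; clamp it. */
    return var > 0.0 ? var : 0.0;
}
```

For the data set 2, 4, 4, 4, 5, 5, 7, 9 the sums are 40 and 232, so the variance is 232/8 - 5*5 = 4 and the standard deviation is 2. The trade-off is precision: when the mean is large relative to the spread, the subtraction cancels most significant digits, so the two-pass method is safer for such data.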

Fast strlen function

Unrolling the loop in the strlen function.

Using sentinel for string manipulation

A method for optimizing search functions.

Checking if the point belongs to an interval

How to replace two comparisons with one.
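
A minimal sketch of the usual way to do this, assuming the interval is closed and non-empty: subtract the lower bound in unsigned arithmetic, so that values below it wrap around to very large numbers, and then a single unsigned comparison covers both bounds. The name `in_range` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when lo <= x <= hi, using one comparison
   instead of two (hypothetical helper). */
static bool in_range(int x, int lo, int hi)
{
    assert(lo <= hi);  /* the trick requires a non-empty interval */

    /* If x < lo, the unsigned difference wraps around and exceeds
       hi - lo; if x > hi, it exceeds hi - lo directly. */
    return (unsigned)x - (unsigned)lo <= (unsigned)hi - (unsigned)lo;
}
```

For example, `in_range(5, 1, 10)` is true, while `in_range(0, 1, 10)` and `in_range(11, 1, 10)` are false. When `lo` and `hi` are constants, the right-hand side is computed at compile time, leaving one subtraction and one comparison.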

Implementing strcmp, strlen, and strstr using SSE 4.2 instructions *

Using new Intel Core i7 instructions to speed up string manipulation.

SSE2 optimised strlen

Optimising the strlen function using SSE2 SIMD instructions.

Table-driven approach *

How to make your code shorter and easier to maintain by using arrays.

Benchmarking CRC32 and PopCnt instructions

Exploring new instructions in Core i7/i5.

Dynamic arrays in C

A new formula for linearly growing arrays.

Web programming

JavaScript as the Next Big Language

Predictions and reality.


New features and showcase examples.

Share buttons

Create a widget for sharing your content on social networks.