SQL Server Word Sound Comparison


As we learned in a previous post, you can use the SOUNDEX function to understand the sound that a word makes. You can then compare it to another word to see if the sound matches. This works okay for exact sound matches. But what if you want check to see if the word sound similar to another word? And how do you find out how similar the words are? SQL Server has you covered. In SQL Server 2005 they released the DIFFERENCE function.

When you use the SOUNDEX function in SQL Server, it returns a 4 character code that represents the sound of the word. The DIFFERENCE function compares two of these 4 character codes and returns a value between 0 and 4. The return value indicates how many of these characters in the 4 character code were the same. Here is an example:

Frog F620
Blog B420


You can see in the example above that if you compare the SOUNDEX code of the two words, only 2 characters are identical. The ‘F’ is different from the ‘B’. The ‘6’ is different from the ‘4’. But the ‘2’ and the ‘0’ are the same between the two codes. Thus… the difference is 2.

Put It All Together
Now that you understand how SOUNDEX and DIFFERENCE work, you can use them to create searches that are based on “sounding like a word”. Here is an example… remember, 0 means the words sound nothing alike and 4 means that the words sound exactly alike. In my example, I am going to match on a likeness of 3… which is sounding pretty similar.

Reference: http://msdn.microsoft.com/en-us/library/ms188753.aspx

Leave a Comment

CommentLuv badge