If you work with stored procedures or if you script objects in SQL Server, you have probably seen SET statements that turn on/off some crazy thing called ANSI_PADDING. If you’re like most people, you just ignore this because it works. I wanted to take a few minutes to explain what this ANSI_PADDING thing is.
What Is ANSI_PADDING?
ANSI_PADDING is an option that controls how VARCHAR and VARBINARY values are stored. If ANSI_PADDING is turned on, then SQL Server will NOT trim the trailing spaces when it inserts into a VARCHAR field. Similarly, it will NOT trim trailing nulls when it inserts into a VARBINARY field.
Setting the ANSI_PADDING only affects the spaces on inserts. It does NOT affect comparisons.
Replacing text in SQL Server is easy. SQL Server has the REPLACE function to perform this task.
SQL Server REPLACE
The REPLACE function in SQL Server has 3 parameters.
Text to search
Text to find
Text to replace with
SELECTREPLACE('Full text to search in','search','replace')ASReplacedText
The output is: Full text to replace in
In the above example, we are searching “Text to search in” for the word “search” and we are replacing it with the word “replace”. Pretty simple, right? Now let’s take it a step further.
SQL Server Case Sensitive Replace
Above we went over how to do a case insensitive replace on a string. (The REPLACE function actually uses the default collation of the input text that it is searching). To turn it in to a SQL Server case sensitive replace, we just need to add one small thing to the end. We need to change the collation of the text we are searching. Learn more about text collation here.
SELECTREPLACE('Full text to search in'COLLATESQL_Latin1_General_CP1_CS_AS,'SEARCH','replace')ASReplacedText
The output is: Full text to search in
This example shows how the “SEARCH” text was not found because we are changing the input text to search collation to SQL_Latin1_General_CP1_CS_AS (which is case sensitive). The input string has “search” in lower case and we were searching for an upper case “SEARCH”.
When it comes to ranking data in SQL Server, as of SQL Server 2005 they have built this in to the database. In 2005 the SQL Server team introduced the RANK function. The RANK function allows you to assign a number to each record in a result set based something in the data.
Let’s say that you wanted to apply a rank based on the weight of animals in your zoo (from heaviest to lightest). Elephant would get a number of 1 (because it’s the heaviest) and a tiny turtle would get the number 100 (because it’s the lightest). The number returned would start at 1 and grow for every record. This is done using this awesome RANK function.
The RANK function also has a partition feature that allows you to group the data. This allows the ranking number to reset for every group. Using our example above, we could extend it to show us the heaviest to lightest animals based on the animal type. So there would be a #1 ranking for the mammal, amphibian, bird, and reptile animal types.
Because the ranking number is incremented in the order specified by the ORDER BY clause, what happens if there are duplicates? In our example above, what if there are 2 animals with the same weight? If this occurs, the same ranking number will be applied to each of the duplicate records. So if there were 2 animals with the same weight, they would get the same exact ranking number.
How To Use RANK
The RANK function has 2 parameters. You must always supply the ORDER BY. The PARTITION BY is optional.
PARTITION BY – This is what you would like SQL Server to group your rankings by. In the above example, if we want the ranking number to reset for every different animal types, then we would specify that here. ORDER BY – The ORDER BY is the order that you want your ranking number to be generated in your result set. This is a required parameter. In the example above, we want it from heaviest to lightest. We would specify the column name here that holds the animal weight.
Here is some sample code and the output to show each of the above scenarios.
In the above example, you can see that we are ranking each record by the weight of the animal from heaviest to lightest. We take this a bit further by grouping these by the AnimalType. This is done by passing the PARTITION BY parameter to the function call.
ROW_NUMBER is a function built-in to SQL Server that will return a row number for each record in your result set. You can further change resulting row number to reset the row number based on some value in the result set. I will show you examples of both below.
To show the row number in SQL Server, you need to use the ROW_NUMBER function. This function is broken down in to two parts.
PARTITION BY – If you supply this parameter, then the row number will reset based on the value changing in the columns supplied. This is kinda like using a GROUP BY.
ORDER BY – This is the specified order of the row number that you would like. If you wanted the order of the row number to increment by an employee name (alphabetically), you do that here.
With this function I think examples will help paint the full picture.
In the following query, the results will show a row number with each record. The number will start at 1 and increment for every record in order of AnimalName. You can see that that the order of the row number increment is specified as a parameter in to the ROW_NUMBER() function.
In the above example, the row number never resets. It started at 1 and kept going through all the records. But what if you wanted to reset the row number back to 1 based on a value changing in your result set. In the following example, every time that the AnimalType changes we will reset the row number back to 1. This way, each AnimalType would have it’s own set of row numbers. We accomplish this by listing the columns that we want to group the row numbers by in the PARTITION BY parameter.
As we learned in a previous post, you can use the SOUNDEX function to understand the sound that a word makes. You can then compare it to another word to see if the sound matches. This works okay for exact sound matches. But what if you want check to see if the word sound similar to another word? And how do you find out how similar the words are? SQL Server has you covered. In SQL Server 2005 they released the DIFFERENCE function.
The DIFFERENCE Function
When you use the SOUNDEX function in SQL Server, it returns a 4 character code that represents the sound of the word. The DIFFERENCE function compares two of these 4 character codes and returns a value between 0 and 4. The return value indicates how many of these characters in the 4 character code were the same. Here is an example:
You can see in the example above that if you compare the SOUNDEX code of the two words, only 2 characters are identical. The ‘F’ is different from the ‘B’. The ‘6’ is different from the ‘4’. But the ‘2’ and the ‘0’ are the same between the two codes. Thus… the difference is 2.
Put It All Together
Now that you understand how SOUNDEX and DIFFERENCE work, you can use them to create searches that are based on “sounding like a word”. Here is an example… remember, 0 means the words sound nothing alike and 4 means that the words sound exactly alike. In my example, I am going to match on a likeness of 3… which is sounding pretty similar.