SQL Server Word Sound Comparison

Facebooktwitterredditpinterestlinkedinmail

As we learned in a previous post, you can use the SOUNDEX function to understand the sound that a word makes. You can then compare it to another word to see if the sound matches. This works okay for exact sound matches. But what if you want check to see if the word sound similar to another word? And how do you find out how similar the words are? SQL Server has you covered. In SQL Server 2005 they released the DIFFERENCE function.

 
The DIFFERENCE Function
When you use the SOUNDEX function in SQL Server, it returns a 4 character code that represents the sound of the word. The DIFFERENCE function compares two of these 4 character codes and returns a value between 0 and 4. The return value indicates how many of these characters in the 4 character code were the same. Here is an example:

[table colwidth=”100|100″ width=”200″ colalign=”left|left”]
Word,SOUNDEX Code
Frog,F620
Blog,B420
[/table]

DIFFERENCE: 2

You can see in the example above that if you compare the SOUNDEX code of the two words, only 2 characters are identical. The ‘F’ is different from the ‘B’. The ‘6’ is different from the ‘4’. But the ‘2’ and the ‘0’ are the same between the two codes. Thus… the difference is 2.

 
Put It All Together
Now that you understand how SOUNDEX and DIFFERENCE work, you can use them to create searches that are based on “sounding like a word”. Here is an example… remember, 0 means the words sound nothing alike and 4 means that the words sound exactly alike. In my example, I am going to match on a likeness of 3… which is sounding pretty similar.

DECLARE	@NameToCheckFor	VARCHAR(50)

SET	@NameToCheckFor = 'donky'

SELECT	AnimalName
FROM	Animal
WHERE	DIFFERENCE(AnimalName, @NameToCheckFor) >= 3

 
 
Reference: http://msdn.microsoft.com/en-us/library/ms188753.aspx

SOUNDEX – Discovering The Sounds Of Words In SQL Server

Facebooktwitterredditpinterestlinkedinmail

Let’s face it, not everyone is a perfect speller all the time. Exact match queries are pretty easy to do in SQL Server. But, what if you wanted to correct their spelling or search based on a misspelled word? How is this possible? In SQL Server 2008+ you have the ability to do this with the SOUNDEX function.

SOUNDEX
The SOUNDEX function will take an input word and convert it in to a 4 character code representing the sound of the word. For example: goat -> G300.

So what does this 4 character code mean? The first character represents the first character in the string, the rest of the characters represent the rest of the characters in the string. Of course SQL Server does some magic to come up with this code. If you are interested more in the makeup of this code, see the MSDN link at the bottom of this post. Essentially, this 4 character code represents the sound of the word.

 
How To Use It
You can use this just like any other function in SQL Server.

SELECT	SOUNDEX('dog') AS SoundOfDog

An easy way to search based on word sound would be something like this:

SELECT	*
FROM	Animal
WHERE	SOUNDEX(AnimalName) = SOUNDEX('dug')

The above query would find words like “dog”, “dig”, “dag”

 
Reference: http://msdn.microsoft.com/en-us/library/ms187384.aspx

How To Use The OUTPUT Clause

Facebooktwitterredditpinterestlinkedinmail

The OUTPUT clause is an amazing part of SQL Server that many people do not know about.  As corny as it is, I still remember the day that I found it.  I had been searching for the whole day trying to figure out how to get the identity values from a large amount of data that I just inserted in to the database.  There had to be a way… right?  After searching and searching, I gave up and just accepted that it is not possible… a few days later I found exactly what I was looking for… the OUTPUT clause.

 
The OUTPUT clause is a part of the query that will return data from before or after the operations is completed.  Let’s say that you inserted data in to a table and you wanted the ID column values (which are auto-numbers).  The OUTPUT clause gives you this information!  It can work on INSERT, DELETE, UPDATE, and MERGE statements.  I will take you through examples of each.

 
To access the data that is being altered in your sql statement, you need to use special column prefixes that SQL Server makes available to you.  The two special prefixes are “inserted” and “deleted”.  During an insert statement, the inserted prefix is available for you to use.  During a delete statement, the deleted prefix is available to you.  During the update and merge statements, both the deleted and inserted prefixes are available to you.  In these cases, the deleted represents the data before it was changed and the inserted represents the data after it was changed.

 
One thing to note is that the data being outputted must go in to a table or table variable.

 
OUTPUT Clause On An INSERT Statement

DECLARE	@OutputData	TABLE
		(AnimalID	INT,
		 AnimalName	VARCHAR(50))

-- Insert into the table and stick the resulting animal
--   name and identity column in to a temp variable
INSERT
INTO	Animal
		(AnimalName)
OUTPUT	inserted.AnimalID, inserted.AnimalName
INTO	@OutputData
		(AnimalID, AnimalName)
VALUES	('Pig'),
		('Dog'),
		('Chinchilla')

-- View the inserted data
SELECT	*
FROM	@OutputData

 
OUTPUT Clause On A DELETE Statement

DECLARE	@OutputData	TABLE
		(AnimalID	INT,
		 AnimalName	VARCHAR(50))

DELETE
FROM	Animal
OUTPUT	deleted.AnimalID, deleted.AnimalName
INTO	@OutputData
		(AnimalID, AnimalName)
WHERE	AnimalName = 'Pig'

-- View the deleted data
SELECT	*
FROM	@OutputData

 
OUTPUT Clause On An UPDATE Statement

DECLARE	@OutputData	TABLE
		(AnimalID	INT,
		 OldAnimalName	VARCHAR(50) ,
		 NewAnimalName	VARCHAR(50))

-- Update the table and output the animal name
--   and identity column to the temp table
UPDATE	Animal
SET	AnimalName = 'Swine'
OUTPUT	inserted.AnimalID, deleted.AnimalName, inserted.AnimalName
INTO	@OutputData
		(AnimalID, OldAnimalName, NewAnimalName)
WHERE	AnimalName = 'Pig'

-- View the updated data
SELECT	*
FROM	@OutputData

 
OUTPUT Clause On A MERGE Statement

DECLARE	@OutputData	TABLE
		(AnimalID	INT,
		 OldAnimalName	VARCHAR(50),
		 NewAnimalName	VARCHAR(50))

DECLARE	@ChangeData	TABLE
		(OldAnimalName	VARCHAR(50),
		 NewAnimalName	VARCHAR(50))

INSERT
INTO	@ChangeData
VALUES	('Pig', 'Swine')

-- Update the table and output the animal name
--   and identity column to the temp table
MERGE	Animal AS target
USING	@ChangeData AS source
		ON (source.OldAnimalName = target.AnimalName)
WHEN	MATCHED
THEN	UPDATE
		SET target.AnimalName = source.NewAnimalName
OUTPUT	inserted.AnimalID, deleted.AnimalName, inserted.AnimalName
INTO	@OutputData
		(AnimalID, OldAnimalName, NewAnimalName);

-- View the updated data
SELECT	*
FROM	@OutputData

 
 
Reference: http://msdn.microsoft.com/en-us/library/ms177564.aspx

 

Parse XML With SQL Server

Facebooktwitterredditpinterestlinkedinmail

SQL Server has a lot of useful functionality built-in when it comes to XML.  Today I want to show you an easy way to parse XML data in to a usable format inside your query.

 
For this example, I will be parsing XML data into a temp table, then inserting that data in to a table.

DECLARE	@XMLToParse  XML

-- Load the XML data in to a variable to work with.
-- This would typically be passed as a parameter to a stored proc
SET	@XMLToParse =	'<Animals>
			   <LandAnimals>
				<Animal>Baboon</Animal>
				<Animal>Yak</Animal>
				<Animal>Zebra</Animal>
			   </LandAnimals>
			</Animals>'

-- Declare temp table to parse data into
DECLARE	@ParsingTable  TABLE
	(Animal	VARCHAR(100))

-- Parse the XML in to the temp table declared above
INSERT
INTO	@ParsingTable
	(Animal)
SELECT	xmlData.A.value('.', 'VARCHAR(100)') AS Animal
FROM	@XMLToParse.nodes('Animals/LandAnimals/Animal') xmlData(A)

-- Insert into the actual table from the temp table

INSERT
INTO	Animals
	(AnimalName)
SELECT	Animal
FROM	@ParsingTable

 
The statement that parses the XML has 2 parts to it.

FROM   @XMLToParse.nodes('Animals/LandAnimals/Animal') xmlData(A)

This part says to parse the @XMLToParse variable (previously filled with the XML file) using the method built in to SQL Server called .nodes.  It specifies the repeating node to be Animals/LandAnimals/Animal.  It assigns an aliases this XML parsed records as a table named xmlData.  The (A) is the column name of the rowset.  This will be referenced in the select part of the statement.

 

SELECT  xmlData.A.value('.', 'VARCHAR(100)') AS Animal

The select part of this statement references xmlData (which is the table aliases) and A (which is the column named for that table).  It calls the .value function to return the value from the table/column.  For the .value function, you pass in 2 elements.

The first element is the field.  In this case we are passing in just a period.  We do this because there is no node below the Animal node.  If there was, we would need to specify it here..

The second parameter is the datatype that you would like the value to be casted as.  Then I always aliases the field to something relevant.

 
Although parsing XML can be a little confusing in SQL Server, it is very powerful.  This is a great way to pass bulk data to a stored procedure from any type of client application.

Difference Between UNION and UNION All

Facebooktwitterredditpinterestlinkedinmail

The UNION operation and the UNION ALL operation perform almost the same operation.  They are both used to combine two result sets in to one result set.  The main difference between the two operations is that the UNION operation will return the unique records in the final result set.  The UNION ALL operation will return any duplicates in the final result set.

 
In the examples below, we will assume we have two tables with the following data.  Notice that the values Cat and Dog exist in both tables.
[table width=”0″ colwidth=”100|100″ colalign=”left|left”]
Table1,Table2
Alligator,Cat
Beaver,Dog
Cat,Eagle
Dog,Frog
[/table]

 
The UNION Operation

SELECT	*
FROM	Table1

UNION

SELECT	*
FROM	Table2

[table width=”0″ colwidth=”100″ colalign=”left”]
Result
Alligator
Beaver
Cat
Dog
Eagle
Frog
[/table]

Notice that the duplicate Cat and Dog values were removed from the result.

 
The UNION ALL Operation

SELECT	*
FROM	Table1

UNION	ALL

SELECT	*
FROM	Table2

[table width=”0″ colwidth=”100″ colalign=”left”]
Result
Alligator
Beaver
Cat
Dog
Cat
Dog
Eagle
Frog
[/table]

Notice that the duplicate Cat and Dog values are left in the result.

 
Speed Considerations

One final difference between UNION and UNION ALL is the speed difference.  Because the UNION operation needs to return a distinct set of values, this will take extra processing time.  A general rule is that if you know that the combined data is going to be unique already… just use the plain UNION ALL operation.  Only use the UNION operation if there are duplicates that you would like to have remove.
NOTE:  The UNION and UNION ALL operations are actually one operation.  The ALL keyword is just an attribute.  Because they are so different, I find it is easier to think of them both as separate operations.

 
Reference:  http://msdn.microsoft.com/en-us/library/ms180026.aspx