Developer Articles: Function to Remove Noise Words from a Search String
When you run a search against a full-text indexed table in Microsoft SQL Server, an error can occur
if any of the words searched for are 'noise' words. These are words that Microsoft considers
too common to be worth searching on, much like searching Google for 'the' or 'it'.
The function below reads a list of noise words from a text file, checks them against the search
string and returns the search string with the noise words removed. To get the noise word text file, just copy
the file called 'noise.eng' or 'noise.enu' (these are the English word lists; other extensions cover other languages)
and save it where the function expects it ('noise.txt' in the web root, unless you change the path in the code).
This file can be found in something like: <Program Files path>\Microsoft SQL Server\Tools\FTDATA\SQLServer\Config\
Usage: Call this function using the following syntax:
strMyCleanSearchString = ClearNoiseWords(strUserSearchString)
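Once the string has been cleaned, it still has to be turned into a full-text search condition. The sketch below is one possible way to do that and is not part of the original function: BuildContainsCondition is a hypothetical helper name, and joining the words with AND is an assumption about how you want the search to behave.

```vbscript
' Hypothetical helper, not part of the original article.
' Turns an already-cleaned search string into a CONTAINS search condition,
' for example: cat dog  becomes  "cat" AND "dog"
Function BuildContainsCondition(CleanString)
    Dim words, i, word, condition
    condition = ""
    words = Split(CleanString, " ")
    For i = 0 To UBound(words)
        ' Drop double quotes so a word cannot break out of its quoted term
        word = Replace(words(i), """", "")
        If Len(word) > 0 Then
            If Len(condition) > 0 Then condition = condition & " AND "
            condition = condition & """" & word & """"
        End If
    Next
    BuildContainsCondition = condition
End Function
```

If the result is an empty string (every word was a noise word), skip the query altogether, because SQL Server also rejects an empty full-text predicate. Pass the condition to the query as a parameter rather than concatenating it into the SQL text.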
' ##############################################################
' ## ClearNoiseWords ##
' ## Author : Andy Sheel - http://www.hexdesign.com ##
' ## Date : 27/04/04 ##
' ## Licence: Free (GNU) but keep this message ##
' ## ##
' ##############################################################
Function ClearNoiseWords(SearchString)
    Dim CleanedString       ' String to hold cleaned search terms
    Dim searchArray         ' Array to hold search terms
    Dim NoiseArray()        ' Array to hold noise words
    Dim objFSO              ' FileSystemObject
    Dim TS                  ' TextStream object
    Dim strLine             ' Current line read from the noise file
    Dim strFileName         ' Path to the noise word file
    Dim ArrayPointer, i, j  ' Integers used as array indexes

    Const ForReading = 1
    Const Create = False

    ArrayPointer = 0
    CleanedString = ""

    ' Noise word file.
    ' Change this to where your noise file is.
    strFileName = Server.MapPath("\noise.txt")

    ' Tokenise the search string into an array
    searchArray = Split(SearchString, " ")

    ' Open the noise word file
    Set objFSO = Server.CreateObject("Scripting.FileSystemObject")
    ' Use the OpenTextFile method to open the text file
    Set TS = objFSO.OpenTextFile(strFileName, ForReading, Create)

    ' Start with one empty slot so UBound works even if the file is empty
    ReDim NoiseArray(0)
    Do While Not TS.AtEndOfStream
        strLine = Trim(TS.ReadLine)
        ' Read each non-blank word into the array, growing it as needed
        If Len(strLine) > 0 Then
            ReDim Preserve NoiseArray(ArrayPointer)
            NoiseArray(ArrayPointer) = strLine
            ArrayPointer = ArrayPointer + 1
        End If
    Loop

    ' Release the file
    TS.Close
    Set TS = Nothing
    Set objFSO = Nothing

    ' For each search word, compare it with the noise list
    ' and blank out any noise words
    For i = 0 To UBound(searchArray)
        For j = 0 To UBound(NoiseArray)
            If LCase(searchArray(i)) = LCase(NoiseArray(j)) Then
                searchArray(i) = ""
                Exit For
            End If
        Next
    Next

    ' Make a new string without the noise words,
    ' skipping blanked entries so no double spaces are left behind
    For i = 0 To UBound(searchArray)
        If Len(searchArray(i)) > 0 Then
            CleanedString = CleanedString & searchArray(i) & " "
        End If
    Next

    ' Return the clean string
    ClearNoiseWords = Trim(CleanedString)
End Function
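
The function above compares every search word with every noise word in a pair of nested loops. As a possible alternative, the sketch below looks each word up in a Scripting.Dictionary instead. It is not the original author's code: ClearNoiseWordsDict is a hypothetical name, and it assumes the noise words have already been loaded into an array (NoiseWords) rather than read from the file inside the function.

```vbscript
' Hypothetical variant, not the original author's code.
' NoiseWords is assumed to be an array of noise words loaded earlier.
Function ClearNoiseWordsDict(SearchString, NoiseWords)
    Const TextCompare = 1   ' Same value as vbTextCompare: case-insensitive keys
    Dim dictNoise, words, i, cleaned

    Set dictNoise = CreateObject("Scripting.Dictionary")
    ' CompareMode must be set while the dictionary is still empty
    dictNoise.CompareMode = TextCompare
    For i = 0 To UBound(NoiseWords)
        If Len(NoiseWords(i)) > 0 Then
            ' Skip duplicates, because Add raises an error on an existing key
            If Not dictNoise.Exists(NoiseWords(i)) Then
                dictNoise.Add NoiseWords(i), True
            End If
        End If
    Next

    cleaned = ""
    words = Split(SearchString, " ")
    For i = 0 To UBound(words)
        ' Keep a word only if it is non-empty and not a noise word
        If Len(words(i)) > 0 Then
            If Not dictNoise.Exists(words(i)) Then
                cleaned = cleaned & words(i) & " "
            End If
        End If
    Next

    ClearNoiseWordsDict = Trim(cleaned)
End Function
```

For example, ClearNoiseWordsDict("The cat and IT sat", Array("the", "it", "and")) returns "cat sat". If you load the noise words once and keep them around (the storage is up to you), each search then costs one dictionary lookup per word instead of a pass over the whole noise list.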
Note: If you want to use this code, please keep the copyright messages. This code is released under the
GNU Licence and is released AS IS.