Semantic search is a feature of SQL Server that allows you to perform advanced searches on unstructured text data. With semantic search, you can use natural language search queries to find relevant information in documents and other text-based data stored in SQL Server.
In this article I have explained, how you can implement semantic search using wordnet in SQL Server.
- Download the files
- Restore wordnet and TileDB on SQL Server 2019 or above
- Run below queries to test
SELECT * FROM titles
exec [dbo].[get_noun_verb] 'tension',null
exec [dbo].[sp_search_items] 'someone is in tension',null
select wordnet.[dbo].[fn_get_noun_verb]('foot',null)
exec [dbo].[sp_search_items] 'psycho',null
select wordnet.[dbo].[fn_get_noun_verb]('psycho',null)
select * FROM sys.dm_fts_parser ('"you are in tension"', 1033, 0, 0)
SELECT [wordid]
,[lemma]
FROM [wordnet].[dbo].[words] where lemma like 'psycho%'
Script for function and Stored procedure:
--Function in wordnet database to get related words
CREATE FUNCTION [dbo].[fn_get_noun_verb] (
@word NVARCHAR(200)
,@type CHAR(1) = NULL
)
RETURNS VARCHAR(max) AS
BEGIN
DECLARE @val VARCHAR(MAX);
SELECT @val = STRING_AGG(ISNULL(lemma, ' '), ',')
FROM (
SELECT DISTINCT subq.lemma
FROM words a
INNER JOIN senses b ON a.wordid = b.wordid
LEFT JOIN synsets c ON b.synsetid = c.synsetid
LEFT JOIN samples d ON b.synsetid = d.synsetid
INNER JOIN (SELECT a1.lemma, b1.synsetid FROM senses b1 INNER JOIN words a1 ON a1.wordid = b1.wordid) subq ON subq.synsetid = c.synsetid
AND subq.lemma <> a.lemma
WHERE a.lemma = @word
AND (@type IS NULL
OR c.pos = cast(@type AS VARCHAR))
) t
RETURN @val
END
GO
ALTER PROC [dbo].[sp_search_items] (
@phrase NVARCHAR(200)
, @type VARCHAR(1)
)
AS
BEGIN
SELECT ROW_NUMBER() OVER (
ORDER BY (
SELECT 1
)
) rownum
,occurrence
,display_term
,CONCAT (
display_term
,','
,isnull(wordnet.dbo.fn_get_noun_verb(display_term, @type), display_term)
) words
INTO #serchwords
FROM sys.dm_fts_parser ('"' + @phrase + '"', 1033, 0, 0) where special_term = 'Exact Match' declare @row INT = 1 declare @max INT
CREATE TABLE #items (
[title_id] [varchar](100) NOT NULL
,[title] [varchar](80) NOT NULL
,[type] [char](12) NOT NULL
,[pub_id] [char](4) NULL
,[price] [money] NULL
,[advance] [money] NULL
,[royalty] [int] NULL
,[ytd_sales] [int] NULL
,[notes] [varchar](200) NULL
,[pubdate] [datetime] NOT NULL
,
) select @max = max (rownum)
FROM #serchwords while @row <= @max begin
DECLARE @searchphrase NVARCHAR(2000)
SELECT @searchphrase = CONCAT (
'"'
,words
,'"'
)
FROM #serchwords
WHERE rownum = @row
INSERT INTO #items
SELECT *
FROM titles
WHERE FREETEXT (
notes
,@searchphrase
)
OR FREETEXT (
title
,@searchphrase
);
SET @row = @row + 1 end select *
FROM #items drop TABLE #items drop TABLE #serchwords
END
GO
Wordnet table information
adjpositiontypes – defines three positions that adjectives can take in English language, predicate, attributive and immediatelly postnominal.
adjpositions – links concrete words (adjectives) with their allowed position types in adjpositiontypes table.
linktypes – defines all relation (link) types used in wordnet, about two dozen of them. Both lexlinks and semlinks tables use this table to define the type of each link. Some link types are marked as recursive, meaning that if “furniture” is, for example, a hypernim to a “chair”, then a “chair” is a hyponym to “furniture”.
lexlinks – lexical links, i.e., relations between words. Example:
sad – saddness (derivation)
semlinks – semantic links, i.e. relations between synsets. Example:
chair – furniture (hypernym)
morphs – connected to “words” table, contains irregular word forms. One word can have multiple morphs, and one morph can be an irregular form for multiple words, so you additionally have the morphmaps table. Examples:
abacus (word) – abaci (morph)
abhor (word) – abhorred, abhorring (morphs)
postypes – defines “parts of speech”. Contains only following values:
n – noun, v –verb, a – adjective, r – adverb, s – adjective satellite.
samples – sample sentences for synsets. One synset can have multiple samples.
vframemaps & vframes – vframes define a kind of standard “verb templates”. Vframemaps links words (verbs) with corresponding vframes in which they can appear.
vframesentencemaps & vframesentences – similar to previous two tables, just here you have entire sentences as verb templates.