We are building a unique app, and I keep going back and fourth on the best way to store search data.
We have an encypted message that has to be searched, and the database will not be able to decrypt this message, it is done with a outside cypto library.
I do however need to globally search those messages to know which ones contain certain words, hence the issue. My thoughts are to split the words of the message before it is encrypted and create an array of randomized words.
I could either store the message in a varchar comma seperated , or a JSON list and use the string_split like so
declare @myMessage table (myMessageId int,randomizedWords varchar(max))
insert into @myMessage(myMessageId,randomizedWords)
select 1,'hello,hi,machine,sports,Love,I'
union all
select 2,'hello,hi,machine,sports,hate'
declare @mySearch varchar(200) = 'I love sports'
declare @table table(myVal varchar(100))
insert into @table(myVal)
select a.value from string_split(@mySearch,' ') a
select a.* from @myMessage a
outer apply(select top 1 aa.* from @table aa left join string_split(a.randomizedWords,',') bb on aa.myVal = bb.value where bb.value is null) c
where c.myVal is null
Alternatively I can store each word in a detail table and not split which would be faster, but it would create a massive table that would not be used that often such as:
declare @myMessage table (myMessageId int,randomizedWords varchar(100))
insert into @myMessage(myMessageId,randomizedWords)
select 1,value
from
string_split('hello,hi,machine,sports,Love,I',',') a
union all
select 2,value
from
string_split('hello,hi,machine,sports,hate',',') a
declare @mySearch varchar(200) = 'I love sports'
declare @table table(myVal varchar(100))
insert into @table(myVal)
select a.value from string_split(@mySearch,' ') a
declare @perfectMatch int =( select count(*) from @table)
select aa.myMessageId from
@myMessage aa
inner join @table bb
on aa.randomizedWords = bb.myVal
group by aa.myMessageId
having count(*) = @perfectMatch
Both get me the results I want, but again I need to store just a word list randomized and my search has to match all words for the group in order to be returned. I am open to sugestions.
THanks