Database planning

I have a table "words" (wordId,word,translation) every word has to be associated with at least one category. I have an additional table "categories" (categoryId,categoryName) what is the best way to associate each word with all the categories it belongs?

one option: an additional table "wordsCategories" (wordId,categoryId)

second option: To add a colomn in the words table that will store all the categories ids that the word belongs, the categories ids will be seperated by a comma. table "words" (wordId,word,translation,categories) insert into words (wordId,word,translation,categories) values (7,"word","travail","9,12,23")

I am open to hear of a better option.

you could go with option #2 but you will realize later on you made a grave mistake and you will need to cut your big toe for such a choice

so I would recommend #1

2 Likes

what about the translation? if there is more than one option, can all the options be written in the translation col (seperated by comma)? or there is another option?

again you can do that but you will pay for it later on, not a good design

create another table maybe as follows.

Create table dbo.Translation(
translationId int,
wordId int,
Translation nvarchar(2500)
LanguageCode varchar(10)
)
1 Like

Do you think,there should be one table for the words I want to translate and additional tables one for each language. or all the words of all languages should be in the main words table in the word col?

look at the sample translation table, what do you see? I think it answers your last question

1 Like

You are right.
one table for the words, and one table for all the translations in all languages.
if you mix up words from all languages in the same column, why dont you do it in the words table? (the translation table will be the words table and it will include the original words)

let go of the idea of stuffing all words in one column. how would you know which word is which language. here is an example of how you could do it.

create table dbo.Words(WordId int not null identity(1,1), Word nvarchar(2500), DefaultLanguageCode varchar(10))

insert into Words
select N'በቆሎ', 'am' 
insert into Words
select N'ቲማቲም', 'am' 
insert into Words
select N'በግ', 'am' 



Create table dbo.Translations(
    translationId int not null identity(1,1),
	wordId int,
	Translation nvarchar(2500),
	LanguageCode varchar(10)
);

insert into Translations
select wordId, 'Corn', 'en'
  from Words where WordID = 1

insert into Translations
select wordId, 'Tomatoes', 'en'
  from Words where WordID = 2

insert into Translations
select wordId, 'Sheep', 'en'
  from Words where WordID = 3

select * From Translations

You could even add a LanguageCode table that lists different languages and use that as a source.
Also be aware of the setting of your database to be able to handle non roman characters. Also depends what you consider your source language, meaning what language are the words in the words table, the source language of the words. I Would recommend not to mix things there. Also remember there could be variation of a language code: en-US, en-au etc.

https://msdn.microsoft.com/en-us/library/ms533052(v=vs.85).aspx

1 Like

Thanks a lot, it is much more clear to me now.
I still dont understand why this option is not better. (in your translation table you are mixing languages in the same column anyway, why not to add the words themselves to the same table)

create table dbo.Words2(
WordId int not null,
Word nvarchar(2500),
LanguageCode varchar(10),
transId int
)

 insert into words2 (wordId,Word,LanguageCode,transId) values (1,N'corn','en',null)
 insert into words2 (wordId,Word,LanguageCode,transId) values (2,N'blé','fr',1)
 insert into words2 (wordId,Word,LanguageCode,transId) values (3,N'maíz','sp',1)
 
 
 insert into words2 (wordId,Word,LanguageCode,transId) values (4,N'apple','en',null)
 insert into words2 (wordId,Word,LanguageCode,transId) values (5,N'Pomme','fr',4)
 insert into words2 (wordId,Word,LanguageCode,transId) values (6,N'manzana','sp',4)
 
 very easy to get a translation of a word in all languages or in a specific language

select word from words2 where transId= (select wordId from words2 where word = 'apple') and `LanguageCode='fr'

what is transId ?

it is a refrence to the wordId that this word translate
eg: the wordId of apple is 4, so each row that translate apple has transId of 4

yes that is doable. imho it just breaks normalization. what you have is a self referencing table usually used for hierarchical data (think ancestry, roots, genes) the value is nullable. this means you are implementing/assuming a business rule (null transId means the source word) by making it nullable. you cant put a referential integrity on it cause it is nullable so someone can add any number they want into that column and there is no validation of sorts.

You as the dev is the only person that knows this rule. it is tribal knowledge. the next person coming in would have to know this. it does not self documenting design. other could chime in with this design approach.

1 Like