SQLTeam.com | Weblogs | Forums

Use SQL to create a demographic simulation from arbitrary set of centiles?


#1

I have an arbitrary (Gaussian) non-uniform normal distribution in a table of centile rankings (by increments of 5%... 0, 5, 10, 15 ...). I want to populate this pattern into a pre-existing set of test data.

╔═════════╦═══════════╗
║ Centile ║ Frequency ║
╠═════════╬═══════════╣
║       5 ║         1 ║
║      10 ║         2 ║
║      15 ║         3 ║
// ------------------ //
║      45 ║         8 ║
║      50 ║         8 ║
║      55 ║         9 ║
║      60 ║         8 ║
// ------------------ //
║      95 ║         2 ║
║     100 ║         1 ║
╠═════════╬═══════════╣
║ Total   ║       100 ║
╚═════════╩═══════════╝

And I have 100 records (of student test data) that need to be assigned to each centile at random, but no more than n times.

So 9 records would be updated as belong to the 55th centile, 2 records the 95th centile, etc.

A procedural loop could do this (T-SQL), but I'd like to just cut to the chase & build it out using one SQL statement.

There'll be any number of target records, however, e.g. 2345 records that need to be proportionally assigned, and the demographic source table might be skewed away from a normal distribution (exponential, Weibull, Laplace). I don't want to simply apply uniform random numbers, or nonuniform random numbers (NURN) via a Poisson function, but by an user-set pattern.