Is overlapping range validation possible?

Calab · August 13, 2024, 4:50pm

I have a table with three columns, "label", "low" and "high". When I add a new row to the table, I want to be sure that the range covered from low to high don't overlap with an existing range.

To make matters worse, I do want to be able to use the values in the high column as a valid value in the low column.

e.g.

First     1   10
Second   11   20
Third    20   30
Fourth   25   35   <--- Invalid because it overlaps with the row before it.

Is this validation possible with SQL during the insert? If so, how?

RogierPronk · August 13, 2024, 5:15pm

Yes, you can use the LAG function to use as a validation. If I understand you correcly it would like something like this:

/* Sample data */
DROP TABLE IF EXISTS #TableMinMax
 
SELECT 1 AS RowNmbr, 'First' AS RowText ,1 AS MinValue,10 AS MaxValue
INTO #TableMinMax
UNION
SELECT 2, 'Second',11,20
UNION
SELECT 3, 'Third',20,30
UNION
SELECT 4, 'Fourth',25,35;

SELECT * 
FROM #TableMinMax;

/* LAG function */
SELECT *, 
ISNULL(LAG(MinValue) OVER (ORDER BY RowNmbr),0) AS Prev_MinValue, ISNULL(LAG(MaxValue) OVER (ORDER BY RowNmbr),0) AS Prev_MaxValue,
CASE WHEN ISNULL(LAG(MaxValue) OVER (ORDER BY RowNmbr),0) > MinValue THEN 'Invalid' ELSE 'Valid' END AS Result
FROM #TableMinMax;

ScottPletcher · August 13, 2024, 6:04pm

The surest method would be to use a trigger on the table to insure the no overlapping rows are ever allowed in the table:


USE tempdb;

DROP TABLE IF EXISTS tempdb.dbo.TableMinMax;

SELECT TOP (0) CAST(1 AS int) AS RowNmbr, CAST('First' AS varchar(50)) AS RowText, CAST(1 AS int) AS MinValue, CAST(10 AS int) AS MaxValue
INTO tempdb.dbo.TableMinMax;

CREATE UNIQUE CLUSTERED INDEX [TableMinMax__CL] ON tempdb.dbo.TableMinMax ( MinValue, MaxValue );
GO
CREATE TRIGGER TableMinMax_Prevent_Overlap
ON tempdb.dbo.TableMinMax
AFTER INSERT, UPDATE
AS
SET NOCOUNT ON;
IF EXISTS(
SELECT * 
FROM inserted i
INNER JOIN tempdb.dbo.TableMinMax t ON t.RowNmbr <> i.RowNmbr AND 
    (i.MinValue < t.MaxValue AND i.MaxValue >= t.MinValue))
BEGIN
    RAISERROR('Row overlaps new/existing row, cancelling INSERT!', 16, 1);
    ROLLBACK TRANSACTION;
END /*IF*/
/*end of trigger*/
GO

INSERT INTO tempdb.dbo.TableMinMax
SELECT 1 AS RowNmbr, 'First' AS RowText ,1 AS MinValue,10 AS MaxValue

INSERT INTO tempdb.dbo.TableMinMax
SELECT 2, 'Second',11,20

INSERT INTO tempdb.dbo.TableMinMax
SELECT 3, 'Third',20,30
UNION ALL
SELECT 4, 'Fourth',25,35  --<<-- bad row!!
GO
SELECT 'A', *
FROM tempdb.dbo.TableMinMax
GO
INSERT INTO tempdb.dbo.TableMinMax
SELECT 3, 'Third',20,30
GO
SELECT 'B', *
FROM tempdb.dbo.TableMinMax
/*end of script*/

Calab · August 13, 2024, 9:43pm

I appreciate the responses, but both of these solutions are WAY over my head.

It looks like it would be much easier/cleaner to do my validation in my code.

Thanks!

khtan · August 14, 2024, 4:26am

You may also consider using a check constraint.

refer to this at StackOverflow

SqlHippo · August 14, 2024, 11:01pm

You can't use a check constraint to do this with the same certainty as you can with a trigger. At least, not in SQL Server. The problem is this:

CHECK constraints aren't validated during DELETE statements.

While this may or may not affect this specific problem, it leaves a door wide open for other scenarios to become an issue, so it's best to leave this work to triggers.

I believe scenarios like this are the reason for a concept called an "ASSERTION", but I couldn't tell you which, if any, database engines actually implemented this SQL feature, since they all have triggers, which can cover the same functionality of an ASSERTION.

robert_volk · August 15, 2024, 2:21am

There's another way to do this in SQL Server, using an indexed view, plus a supplemental table of numbers:

SET ANSI_NULLS,ANSI_PADDING,ANSI_WARNINGS,ARITHABORT,CONCAT_NULL_YIELDS_NULL,QUOTED_IDENTIFIER ON;
GO
SET NUMERIC_ROUNDABORT OFF;
GO
DROP VIEW IF EXISTS dbo.no_overlap_ranges;
DROP TABLE IF EXISTS dbo.numbers,dbo.ranges;
CREATE TABLE dbo.numbers(n int NOT NULL PRIMARY KEY CHECK (n>=0));
CREATE TABLE dbo.ranges(lo int NOT NULL, hi int NOT NULL, CHECK(lo<hi), CHECK(lo>=0), CHECK(hi>=1), PRIMARY KEY(lo,hi));
GO
CREATE VIEW dbo.no_overlap_ranges(x) WITH SCHEMABINDING AS
SELECT n.n-1+a.lo
FROM dbo.ranges a
INNER JOIN dbo.numbers n ON n.n BETWEEN 1 AND a.hi-a.lo;              -- allows for hi value to be low value in another range
GO
CREATE UNIQUE CLUSTERED INDEX no_overlaps ON dbo.no_overlap_ranges(x) WITH (DATA_COMPRESSION=PAGE);
GO
SET NOCOUNT ON;
INSERT dbo.numbers VALUES(1);  -- prime it with 1 row
GO
WITH cte(x) AS (SELECT max(n) FROM dbo.numbers) -- use max value to create safe increment, GO 10 to repeat insert 10 times
INSERT dbo.numbers SELECT n+x FROM dbo.numbers CROSS JOIN cte;
GO 10       -- need to GO n to generate 2 ^ n total rows, this is the maximum size of the range (hi-lo)
INSERT dbo.ranges VALUES(1,10);     -- safe, no overlap
--INSERT dbo.ranges VALUES(10,11);   -- safe, no overlap
INSERT dbo.ranges VALUES(11,20);    -- safe, no overlap
INSERT dbo.ranges VALUES(20,30);    -- safe, no overlap
INSERT dbo.ranges VALUES(25,35);    -- fails with overlap

 SELECT * FROM dbo.ranges;

As it says in the comments, the numbers table needs to be populated with enough values to represent the maximum size of the ranges you want to store. If you needed a range of 10,000, then you'd have to populate the numbers table from 1 to 10,000 with no gaps.

The indexed view takes up space in order to populate unique values that can be constrained by a unique index. It's not the best way to accomplish your goal but unless you're going to have lots of data with large ranges, it shouldn't be overwhelmingly large.

By the way, if you are able to use Postgres instead of SQL Server, it does support exclusions:

Edit: logic error in the original view definition, this has been corrected