Standalone code but how to combine into a query...?

QuattroDave · June 6, 2017, 11:25am

Morning all,

I've written (adapted from c#) some SQL to do spherical trigonometry, working out distance from 2 grid references. The example below calculates Hyde Park To Regent's Park in London, as the crow flies 1.76 miles.

Declare @Lat1 As Real = -0.167919
Declare @Long1 As Real = 51.5072682
Declare @Lat2 As Real = -0.1591581
Declare @Long2 As Real = 51.5312705

Declare @dLat1InRad As Real = @Lat1 * (Pi() / 180.0)
Declare @dLong1InRad As Real = @Long1 * (Pi() / 180.0)
Declare @dLat2InRad As Real = @Lat2 * (Pi() / 180.0)
Declare @dLong2InRad As Real = @Long2 * (Pi() / 180.0)

Declare @dLongitude As Real = @dLong2InRad - @dLong1InRad
Declare @dLatitude As Real = @dLat2InRad - @dLat1InRad

Declare @a As Real = Power(Sin(@dLatitude / 2.0), 2.0) + Cos(@dLat1InRad) * Cos(@dLat2InRad) * Power(Sin(@dLongitude / 2.0), 2.0)
Declare @c As Real = 2.0 * Asin(Sqrt(@a))

Declare @kEarthRadiusMiles As Real = 3956.0
Declare @dDistance As Real = Round(@kEarthRadiusMiles * @c, 2)

SELECT @dDistance

Executing by its self it works fine, but I'm really struggeling to work out how to integrate it with a query example, how would I calculate Hyde Park to an employees location??

Declare @Lat1 As Real = -0.167919
Declare @Long1 As Real = 51.5072682
Declare @Lat2 As Real = 0
Declare @Long2 As Real = 0

SELECT Name, Address1, Address2, Postcode, Geolocation.Longitude, Geolocation.Latitude
FROM Employees
FULL OUTER JOIN Geolocation
ON Geolocation.Postcode = REPLACE (Postcode, RIGHT(Postcode, 3), '')
WHERE EmpID = 1001

Many thanks

Dave

JamesK · June 6, 2017, 12:14pm

SQL Server has geometry and geography types, which you may want to investigate. They can calculate distance on a spherical surface, or geographical distances. See here.

If you want to use your own calculations, you can put the calculations into a function and use the function.

-- FUNCTION
CREATE FUNCTION dbo.fnSphericalDistance
(
	@Lat1  FLOAT,
	@Long1 FLOAT,
	@Lat2  FLOAT,
	@Long2 FLOAT
)
RETURNS FLOAT
AS
BEGIN
	DECLARE @distance FLOAT;
	-- Your code to calculate distance here
	RETURN @distance;

END

-- using the function
SELECT dbo.fnSphericalDistanceCalculator
	(g.StartLatiude, g.StartLongitude,
	 g.EndLatitude, g.EndLongitude),
	 YourOtherColumns
FROM
	Employees AS e
	LEFT JOIN GeoLocation AS g ON
		g.PostCode = .....

Kristen · June 6, 2017, 12:22pm

If you are looking for "all places near X" then in the past what we have done is:

Using some basic Pythagoras to calculate top-left and bottom-right co-ords for a box/rectangle that is guaranteed to include all the points (i.e. also including some points outside a "N-miles-from-X" circle, but at least big enough to include the whole circle, including spherical geometry - if that is important to you).

Base the initial query on Lat/Lon BETWEEN that pair of co-ords. If you put an index on Lat and Long columns this part of the query is very fast.

THEN employ more complex calculation on the resulting dataset to find out if it is within the circle (or using more CPU intensive Haversine algorithm etc)

I don't know if this is still necessary when using newer spatial datatypes in recent versions of SQL

Jason_A_Long · June 7, 2017, 8:25pm

I agree with the suggestion of dropping the code into a user defined function. I would, however, suggest that you create it as an inline table valued function (iTVF) rather than a scalar function.

Something like the following...

CREATE FUNCTION dbo.DistanceInMiles
(
    @Lat1 As Real,
    @Long1 As Real,
    @Lat2 As Real,
    @Long2 As Real
)
RETURNS TABLE WITH SCHEMABINDING AS 
RETURN
    SELECT 
        Distance = ROUND(3956.0 * c.c, 2)
    FROM 
        ( VALUES (@Lat1, @Long1, @Lat2, @Long2) ) opv (Lat1, Long1, Lat2, Long2)
        CROSS APPLY ( VALUES (
                                opv.Lat1 * (PI() / 180.0),
                                opv.Long1 * (PI() / 180.0),
                                opv.Lat2 * (Pi() / 180.0),
                                opv.Long2 * (Pi() / 180.0)
                            ) ) rad (dLat1InRad, dLong1InRad, dLat2InRad, dLong2InRad)
        CROSS APPLY ( VALUES (
                                rad.dLong2InRad - rad.dLong1InRad,
                                rad.dLat2InRad - rad.dLat1InRad
                            ) ) d (dLongitude , dLatitude)
        CROSS APPLY ( VALUES (Power(Sin(d.dLatitude / 2.0), 2.0) + Cos(rad.dLat1InRad) * Cos(rad.dLat2InRad) * Power(Sin(d.dLongitude / 2.0), 2.0)) ) a (a)
        CROSS APPLY ( VALUES (2.0 * Asin(Sqrt(a.a))) ) c (c);
GO

QuattroDave · June 12, 2017, 11:33am

HI Jason,

Thanks for your reply, can I ask why you wou8ld use an inline table valued function over a scalar function?

Thanks

Dave

Kristen · June 12, 2017, 1:37pm

Dunno if it is Jason's reason, but I try to avoid Scalar functions as they tend to have terrible performance (compared to TVF)

Jason_A_Long · June 12, 2017, 2:16pm

This is the subject of many full length articles and a full explanation is beyond the scope of a single forum post.
Long story short, inline table valued functions (iTVF) perform much better than scalar & multi-statement (mTVF) functions.

Being that you have code for both a scalar & iTVF, you can test for yourself, create both functions, run them against your data, and compare the execution times.

One thing to keep in mind... Looking at the execution plans will be somewhat deceptive in that the iTVF version will look more expensive. This is because it isn't able to generate an accurate cost of non-inline functions.

Here are a couple of articles that should be worth reading...
Scalar functions, inlining, and performance: An entertaining title for a boring post
Inline Scalar Functions

Jason_A_Long · June 12, 2017, 2:23pm

Sort of... My reasoning is absolutely performance based but the above statement is flawed in that it doesn't differentiate between inline table valued functions (iTVF) and multi-statement table valued functions (mTVF)...
Both scalar AND mTVFs have terrible performance when compared to iTFVs.