Saturday 19 October 2019

sql - When should I use cross apply over inner join?



What is the main purpose of using CROSS APPLY?



I have read (vaguely, through posts on the Internet) that cross apply can be more efficient when selecting over large data sets if you are partitioning. (Paging comes to mind)



I also know that CROSS APPLY doesn't require a UDF as the right-table.



In most INNER JOIN queries (one-to-many relationships), I could rewrite them to use CROSS APPLY, but they always give me equivalent execution plans.




Can anyone give me a good example of when CROSS APPLY makes a difference in those cases where INNER JOIN will work as well?






Edit:



Here's a trivial example, where the execution plans are exactly the same. (Show me one where they differ and where cross apply is faster/more efficient)



create table Company (
companyId int identity(1,1)

, companyName varchar(100)
, zipcode varchar(10)
, constraint PK_Company primary key (companyId)
)
GO

create table Person (
personId int identity(1,1)
, personName varchar(100)
, companyId int

, constraint FK_Person_CompanyId foreign key (companyId) references dbo.Company(companyId)
, constraint PK_Person primary key (personId)
)
GO

insert Company
select 'ABC Company', '19808' union
select 'XYZ Company', '08534' union
select '123 Company', '10016'



insert Person
select 'Alan', 1 union
select 'Bobby', 1 union
select 'Chris', 1 union
select 'Xavier', 2 union
select 'Yoshi', 2 union
select 'Zambrano', 2 union
select 'Player 1', 3 union
select 'Player 2', 3 union

select 'Player 3', 3


/* using CROSS APPLY */
select *
from Person p
cross apply (
select *
from Company c
where p.companyid = c.companyId

) Czip

/* the equivalent query using INNER JOIN */
select *
from Person p
inner join Company c on p.companyid = c.companyId

Answer




Can anyone give me a good example of when CROSS APPLY makes a difference in those cases where INNER JOIN will work as well?





See the article in my blog for detailed performance comparison:





CROSS APPLY works better on things that have no simple JOIN condition.



This one selects 3 last records from t2 for each record from t1:




SELECT  t1.*, t2o.*
FROM t1
CROSS APPLY
(
SELECT TOP 3 *
FROM t2
WHERE t2.t1_id = t1.id
ORDER BY
t2.rank DESC
) t2o



It cannot be easily formulated with an INNER JOIN condition.



You could probably do something like that using CTE's and window function:



WITH    t2o AS
(
SELECT t2.*, ROW_NUMBER() OVER (PARTITION BY t1_id ORDER BY rank) AS rn
FROM t2

)
SELECT t1.*, t2o.*
FROM t1
INNER JOIN
t2o
ON t2o.t1_id = t1.id
AND t2o.rn <= 3


, but this is less readable and probably less efficient.




Update:



Just checked.



master is a table of about 20,000,000 records with a PRIMARY KEY on id.



This query:



WITH    q AS

(
SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS rn
FROM master
),
t AS
(
SELECT 1 AS id
UNION ALL
SELECT 2
)

SELECT *
FROM t
JOIN q
ON q.rn <= t.id


runs for almost 30 seconds, while this one:



WITH    t AS 
(

SELECT 1 AS id
UNION ALL
SELECT 2
)
SELECT *
FROM t
CROSS APPLY
(
SELECT TOP (t.id) m.*
FROM master m

ORDER BY
id
) q


is instant.


No comments:

Post a Comment

php - file_get_contents shows unexpected output while reading a file

I want to output an inline jpg image as a base64 encoded string, however when I do this : $contents = file_get_contents($filename); print &q...