MSSQLCity.Com - All about MS SQL
     
About Us  
SSWUG Articles  
Articles  
Administering  
Comparison  
General  
Know How  
Replication  
Tuning  
Undocumented  
UDF  
SQL 6.5  
FAQ  
Scripts  
Tips  
Test Exams  
Advertise  
Download  
History  
Search  
Traffic  
Related Links  
     
Your button logo
Add to Favorites
 
     
 

Microsoft SQL Server Merge Joins

Alexander Chigrik
chigrik@mssqlcity.com


Microsoft SQL Server 7.0/2000 supports three types of join operations:

  • Nested-Loop joins
  • Merge joins
  • Hash joins


  • In this article, I want to tell you about Merge joins and when SQL Server uses this kind of join operation.

    This is a most effective method to join the tables. The merge join will be used, if both inputs are sorted on the merge columns. The best way, if the tables have a clustered index on the column that joins the tables.

    If you have two tables (Table1 and Table2), where n1 - the count of rows in the Table1 table, and n2 - the count of rows in the Table2 table, and there is no index on the column that joins this tables, then in the worse case (with Nested-Loop join) SQL Server will scan n1xn2 rows to return a results set (for each row from the outer table the inner table will be completely scanned).

    In the best case, if Table1 table has a clustered index on the column that joins the tables, and Table2 table has a clustered index on the column that joins the tables, and there is one-to-many relationship between Table1 and Table2, then Merge join will be used, and SQL Server will scan n1 + n2 rows to return a results set.

    This is the example:

    if object_id('dbo.Table1') is not null drop table Table1
    GO
    CREATE TABLE Table1 (Table1_id int primary key CLUSTERED, name char(10))
    GO
    if object_id('dbo.Table2') is not null drop table Table2
    GO
    CREATE TABLE Table2 (
      Table2_id int primary key NONCLUSTERED, 
      Table1_id int,
      name char(10))
    GO
    CREATE CLUSTERED INDEX indTable2 ON Table2 (Table1_id)
    GO
    
    DECLARE @i int
    SELECT @i = 1
    WHILE @i < 1000
      BEGIN
        INSERT INTO Table1 VALUES (@i, LTRIM(str(@i)))
        SELECT @i = @i + 1
      END
    GO
    
    DECLARE @i int
    SELECT @i = 1
    WHILE @i < 1000
      BEGIN
        INSERT INTO Table2 VALUES (@i, @i, LTRIM(str(@i)))
        SELECT @i = @i + 1
      END
    GO
    
    SET SHOWPLAN_TEXT ON
    GO
    SELECT a.Table1_id, b.Table1_id FROM Table1 a INNER JOIN Table2 b
      ON a.Table1_id = b.Table1_id
    GO
    SET SHOWPLAN_TEXT OFF
    GO
    
    This is the algorithm of the Merge join (the description of its work in general case, for many-to-many relationship):

    while (not Table1.eof) and (not Table2.eof) do
      begin
        while Table2.Table1_id > Table1.Table1_id do Table1.MoveToNextRecord();
        value = Table1.Table1_id;
        while Table2.Table1_id < value do Table2.MoveToNextRecord();
        RID = Table1.RowID();
        while Table2.Table1_id = value do
          begin
            while Table1.Table1_id = value do
              begin
                < SELECT Table1.Table1_id, Table2.Table1_id > 
                Table1.MoveToNextRecord();
              end
            Table1.MoveTo(RID);
            Table2.MoveToNextRecord();
          end
      end
    
    Note. If the joined tables are small (contain only one data page, for example), and at least one of the joined tables have index on the column that joins the tables, then SQL Server will use Nested-Loop join instead of Merge join or Hash join (usually).

    Because the query optimizer usually selects the best execution plan for a given select statement, it is not necessary to enforce the desirable join type, but sometimes it can be useful. You can enforce the desirable join type by using the OPTION clause.

    This is the example to enforce Merge join:

    USE pubs
    GO
    SET SHOWPLAN_TEXT ON
    GO
    SELECT a.au_id FROM authors a JOIN titleauthor b
       ON a.au_id = b.au_id OPTION (MERGE JOIN)
    GO
    SET SHOWPLAN_TEXT OFF
    GO
    
    This is the result:

    
    StmtText
    ------------------------------------------------------------------------------------------------
    SELECT a.au_id FROM authors a JOIN titleauthor b
       ON a.au_id = b.au_id OPTION (MERGE JOIN)
    
    (1 row(s) affected)
    
    StmtText
    ------------------------------------------------------------------------------------------------
    |--Merge Join(Inner Join, MERGE:([a].[au_id])=([b].[au_id]), RESIDUAL:([a].[au_id]=[b].[au_id]))
         |--Clustered Index Scan(OBJECT:([pubs].[dbo].[authors].[UPKCL_auidind] AS [a]), ORDERED)
         |--Index Scan(OBJECT:([pubs].[dbo].[titleauthor].[auidind] AS [b]), ORDERED)
    
    (3 row(s) affected)
    

     

     
    Visit The SQL Server Worldwide User's Group for all the latest news and information about SQL Server, Oracle, DB2 and XML for developers and administrators.

    (c) 1997, 2005 Bits on the Wire, Inc