Database, Technology

SQL Server FileStream and Full Text Search

If your are planning to implement a custom document library application using SQL Server and .NET, full-text search comes handy to implement content search on your document. Storing your documents in SQL Server FileSteam gives you many advantages. You can implement Google-like search on your document  as well. In PART I of this post I tried to provide some insight into some of these features. In Part II, I will show you how to configure SQL Server to enable filestream and do full-text search and I will also include a sample .NET application.

SQL Server 2008 provides the functionality for applications and users to issue full-text queries against character-based data in SQL Server tables. SQL Server supports full text indexing on columns with any of the following data types: char, varchar, nchar, nvarchar, text, ntext, image, xml, varbinary, or varbinary(max). Full-text queries perform linguistic searches against text data in full-text indexes by operating on words and phrases based on rules of a particular language such as English. Full-text queries can include simple words and phrases or multiple forms of a word or phrase.

Full-text search supports more than 50 languages, such as English, Spanish, Chinese, Japanese, Arabic and German. For each supported language, SQL Server provides language-specific linguistic components, including a word breaker and stemmer and an empty thesaurus file. For each full-text language, SQL Server also provides a file in which you can optionally define language-specific synonyms to extend the scope of search queries (a thesaurus file).

For writing full-text queries, SQL Server provides a set of full-text predicates (CONTAINS and FREETEXT) and rowset-valued functions (CONTAINSTABLE and FREETEXTTABLE). Using these, applications and users can perform a variety of types of full-text searches, such as searching on a single word or phrase (and optionally ranking the result set), searching on a word or phrase close to another word or phrase, or searching on synonymous forms of a specific word.

SQL Server File Stream

FILESTREAM enables SQL Server-based applications to store unstructured data, such as documents and images, on the file system. Applications can leverage the rich streaming APIs and performance of the file system and at the same time maintain transactional consistency between the unstructured data and corresponding structured data. FILESTREAM integrates the SQL Server Database Engine with an NTFS file system by storing varbinary(max) binary large object (BLOB) data as files on the file system. Transact-SQL statements can insert, update, query, search, and back up FILESTREAM data. Win32 file system interfaces provide streaming access to the data. FILESTREAM uses the NT system cache for caching file data. This helps reduce any effect that FILESTREAM data might have on Database Engine performance. The SQL Server buffer pool is not used; therefore, this memory is available for query processing.

FILESTREAM is not automatically enabled when you install or upgrade SQL Server. You must enable FILESTREAM by using SQL Server Configuration Manager and SQL Server Management Studio. To use FILESTREAM, you must create or modify a database to contain a special type of filegroup. Then, create or modify a table so that it contains a varbinary(max) column with the FILESTREAM attribute. After you complete these tasks, you can use Transact-SQL and Win32 to manage the FILESTREAM data.

When you use FILESTREAM to store binary large object (BLOB) data, you can use Win32 APIs to work with the files. To support working with FILESTREAM BLOB data in Win32 applications, SQL Server provides  couple of functions and some APIs.

PathName returns a path as a token to a BLOB. An application uses this token to obtain a Win32 handle and operate on BLOB data.

GET_FILESTREAM_TRANSACTION_CONTEXT() returns a token that represents the current transaction of a session. An application uses this token to bind FILESTREAM file system streaming operations to the transaction.

The OpenSqlFilestream API obtains a Win32 file handle. The application uses the handle to stream the FILESTREAM data.

All FILESTREAM data container access is performed in a SQL Server transaction. Transact-SQL statements can be executed in the same transaction to maintain consistency between SQL data and FILESTREAM data.

SQL Server Full-Text Engine

The SQL Server Full-Text Engine is a full-text indexing and search engine. In SQL Server 2008, the Full-Text Engine has been fully integrated into the Database Engine. The Full-Text Engine now resides in the SQL Server process rather than residing in a separate process like in the previous versions of SQL Servers. Integrating the Full-Text Engine into the Database Engine has improved full-text manageability, optimization of mixed query, and overall performance. For each instance of SQL Server, there is a dedicated instance of the Full-Text Engine, including dedicated components such as word breakers and filters, resources such as memory, and configuration such as service-level settings at the instance level.


SQL Server Process

Full-text search uses the following components of the SQL Server process:

User tables: These tables contain the data to be full-text indexed.

Crawl Thread (Full-text gatherer): The full-text gatherer works with the full-text crawl threads. It is responsible for scheduling and driving the population of full-text indexes, and also for monitoring full-text catalogs.

Thesaurus files: These files contain synonyms of search terms

Stoplist objects: Stoplist objects contain a list of common words that are not useful for the search.

SQL Server query processor: The query processor compiles and executes SQL queries. If a SQL query includes a full-text search query, the query is sent to the Full-Text Engine, both during compilation and during execution. The query result is matched against the full-text index.

Full-Text Engine: The Full-Text Engine in SQL Server is now fully integrated with the query processor. The Full-Text Engine compiles and executes full-text queries. As part of query execution, the Full-Text Engine might receive input from the thesaurus and stoplist. In SQL Server 2008 and later versions, the Full-Text Engine for SQL Server runs inside the SQL Server query processor.

Indexer: The index writer builds the structure that is used to store the indexed tokens.

Filter daemon manager: The filter daemon manager is responsible for monitoring the status of the Full-Text Engine filter daemon host.

Filter Daemon Host

The filter daemon host is a process that is started by the Full-Text Engine. It runs the following full-text search components, which are responsible for accessing, filtering, and word breaking data from tables, as well as for word breaking and stemming the query input:

Protocol handler: One of its responsibilities is to gather data from the columns being full-text indexed and pass it to the filter daemon host, which will apply filtering and word breaker as required.

Filters: Some data types require filtering before the data in a document can be full-text indexed, including data in varbinary, varbinary(max), image, or xml columns. The filter used for a given document depends on its document type. For example, different filters are used for Microsoft Word (.doc) documents, Microsoft Excel (.xls) documents, and XML (.xml) documents. Then the filter extracts chunks of text from the document, removing embedded formatting and retaining the text and, potentially, information about the position of the text. The result is a stream of textual information.

Word breakers and stemmers: A word breaker is a language-specific component that finds word boundaries based on the lexical rules of a given language (word breaking). Each word breaker is associated with a language-specific stemmer component that conjugates verbs and performs inflectional expansions. At indexing time, the filter daemon host uses a word breaker and stemmer to perform linguistic analysis on the textual data from a given table column. The language that is associated with a table column in the full-text index determines which word breaker and stemmer are used for indexing the column.

Full Text Catalog

The most commonly used indexes in a SQL Server database are clustered and non-clustered indexes that are organized in a B-tree structure. You can create these types of indexes on most columns in a table or a view, except those columns configured with large object (LOB) data types, such as text and varchar(max). Although this limitation is not a problem in many cases, there will be times when you’ll want to query such column types. However, without indexes defined on the columns, the query engine must perform a full table scan to locate the necessary data.

A full-text index is made up of word tokens that are derived from the text being indexed. For example, if the indexed text contains the phrase “tables can include indexes,” the full-text index would contain four tokens: “tables,” “can,” “include,” and “indexes.” Because the list of tokens can be easily searched, full-text queries can quickly locate the necessary records.

A full-text catalog provides a mechanism for organizing full-text indexes. Each catalog can contain zero or more indexes, but each index can be associated with only one catalog. Catalogs are implemented differently in SQL Server 2005 and 2008:

  • SQL Server 2005: A full-text catalog is a physical structure that must reside on the local hard drive associated with the SQL Server instance. Each catalog is part of a specific filegroup. If no filegroup is specified when the catalog is created, the default filegroup is used.
  • SQL Server 2008: A full-text catalog is a logical concept that refers to a group of full-text indexes. The catalog is not associated with a filegroup.

To prevent a full-text index from becoming bloated, SQL Server has a mechanism that discards commonly occurring strings that do not help the search. These discarded strings are called stopwords (aka noise words). During index creation, the Full-Text Engine omits stopwords from the full-text index. This means that full-text queries will not search on stopwords.

Search Filter

When a cell in a varbinary(max), or image column contains a document with a supported document-file extension, full-text search uses a filter to interpret the binary data. The filter, which implements the iFilter interface, extracts the textual information from the document and submits it for indexing. To identify the filters included in an instance of SQL Server, use the sp_help_fulltext_system_components (Transact-SQL) stored procedure, which returns information for the registered word-breakers, filter, and protocol handlers.

Many document types can be stored in a single varbinary(max), or image column. For each document, SQL Server chooses the correct filter based on the file extension. Because the file extension is not visible when the file is stored in a varbinary(max), or image column, the file extension must be stored in a separate column in the table, called a type column. This type column can be of any character-based data type and contains the document file extension, such as .doc for a Microsoft Word document.

When creating a full-text index on a varbinary(max), or image column you must identify a corresponding type column that has the extension information so that SQL Server knows which filter to use. The IDs of the full-text indexed column and its associated type column can be found using the sys.full-text_index_columns catalog view.

After the varbinary(max), or image column is full-text indexed, it can be queried using the search predicates CONTAINS and FREETEXT.



Thesaurus provides the ability to define synonyms and acronyms and use the synonyms in full text searches. With SQL commands like CONTAINS, CONTAINSTABLE, the thesaurus is used to identify expressions or replacements for the searched terms. The thesaurus files are in XML format with a global file (tsGLOBAL.xml) and then 18 language specific files. By default all of the files have sample XML that is commented out. So by default no synonyms are setup when a full text search is issued.

The files are located in SQL_Server_install_path\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\FTDATA\.

The thesaurus file includes a <diacritics_sensitive> tag. This element contains an integer value that controls accent sensitivity. For example, suppose you specify the pattern “café” to be replaced by other patterns in a Full-Text Search query. If the thesaurus file is accent-insensitive, Full-Text Search replaces the patterns “café” and “cafe”. If the thesaurus file is accent-sensitive, Full-Text Search replaces only the pattern “café”. Note that this setting can only be applied one time in the file, and applies to all the search patterns in the file.



In this post, I tried to explain the filestream and full-text search feature in SQL Server. In SQL Sever 2008, there is great improvement in these features compared to older versions of SQL Server.  One of the limitations with full text search is lack of support for soundex search, but we can use Google spell checker or some of the third-party tools such as aspNetSpell or the RADSpell from telerik.

Stay tuned for the PART II .

Social, Technology

Seven Habits Of Highly Effective People



    I. From Dependence to Independence

  • Be ProActive
  • Begin with End in Mind
  • Put First things First

II. From Independence to Interdependence

  • Think WinWn
  • Seek First to Understand, Then to Be Understood
  • SynerGize

III. Continuous Improvement

  • Sharpen the Saw
Patterns, Technology, UI Design

Anti Patterns

Most of you heard a lot about the software design patterns and most you have used it your day to day software design. The Gang of Four (GoF) patterns are generally considered the foundation for all other patterns. Design patterns are recurring solutions to software design problems in application development.  DoFactory is a good place to start exploring the design patterns, if you are new to patterns.

Ok, enough said about the patterns, here I want talk about the anti patterns. 

What is an anti pattern any way?

It is pattern that tells how to go from a problem to a bad solution to that problem. Anti-Patterns generally represent an effort to define and classify reoccurring non-solutions, i.e., things that lots of projects do that fail to yield a solution, or actually prevent a project from working or being finished. Anti patterns tells you why a  bad solution looks very attractive, and why it is bad and what good pattern are applicable in that case. Identifying bad practices can be as valuable as identifying good practices.

You can think of it as the  old adage, “If your only tool is a hammer, everything looks like a nail”.

Anti Pattern Categories

Architectural Anti Patterns

  • Over-Generalized Interfaces:  Architects often attempt to create systems with infinite flexibility, but only succeed in creating systems that are impossible to maintain. With interfaces,  architects fall in love with the xml interface. All communication between components is done through xml. This created a maintenance nightmare where there are no explicit contracts between interfaces.
  • Over buzzword compliance:  Architects love to find a way to shoehorn the latest buzzword into their architectures, even when there is no fit. Whatever the latest buzzword (xml, REST, WCF), architects will find a way to use it.
  • Product Architecture : Some Architects that are only familiar with a certain product or vendor will design the architecture around the products. The result is often just a marketing sheet, and helps very little to solving the problem at hand.
  • FUD Architecture : The fear of being wrong, or creating an architecture that will change later, results in an architecture that actually solves nothing.

Development Anti Patterns

There is several development anti patterns related to Developers, Scoping,  UI, Tools,  Software Strategy and Development process. You can find more details on these categories in Development Anti Pattern Road Map.  I will point out some of the interesting and most frequent ones here.

  • Control Freek: If a product has a custom language for interacting with the system or its database that doesn’t have equivalent APIs in a common programming language, it ‘will be’ costly to use, customize, and maintain. These specialized languages tend to create high priced experts. This steers business to the vendor’s own consulting services. Most commonly seen in big enterprise software “solutions”.
  • Job Keeper: An undocumented code written for the job security of the current programmer, who can allegedly fix it, usually with more DuctTape.
  • Programmers often miss the distinction between what’s needed and what’s “neat”.
  • ‘Thats Not Really An Issue’: Inexperienced team members running a project ignore concerns that they don’t understand.
  • Voodoo Chicken Coding is when something is not fully understood in a programming problem. This results, typically, in a lot of cut-and-pasting from other code that does something similar, but works; arbitrary reordering of statements; and other tricks.

Management Anti Patterns

  • Band Aid: There is a deep-rooted problem in a company/project that will cause the company/project to fail unless addressed.Unlike the human body, a company/project does NOT heal itself once a band-aid is put over a wound. This Anti Pattern is popular because it is usually quick and painless in the short term, but many long term consequences are ignored. Despite the apparent glossy exterior, a project may now be doomed to fail, because the root problems were not addressed completely or at all!
  • Blame Storming: A discussion – usually amongst adjacent peer-group levels, in which members attempt to assign blame for a particular botch. The  blame is spread thinly enough that not too much sticks to any one individual, and in particular, to no-one in the current group.
  • Scape Goat:  The troubles, even failure, of the project are highly visible. The project is late or over budget or in serious technical difficulty. People are angry and disgruntled and some are even leaving. Management is pressuring to rectify issues with the project. Management find a scapegoat to punish through demotion, re-scoping or removal of responsibilities, or banishment to some area of no value or importance, the ScapeGoat usually does get sacked.
  • Idiot Proof Process: A software team is not delivering.You are under pressure to change something, because the project is clearly broken. You spend weeks in process meetings arguing. In each meeting, you attempt to prevent all possible mistakes that a naive or malignant coder could make. Next, you then argue about whether a given mistake is really always a mistake, or is sometimes a good idea. You begin to doubt the competence of everybody in the meeting, including yourself. Lessons Leaned: You cannot develop an idiot-proof process. A good process depends on good developers to execute it.

See Management Anti Pattern Road Map to read more…

User Interface Anti Patterns

  • Use of rocket science to solve simple problems: Sometimes a designer constructs a solution that is much more complex and confusing than the prior solution. The anti-pattern is often seen when designers have explored new ways of doing things that are then forced upon simple problems. An example is use of drag-and-drop techniques for simple attribute setting. Does a shopping cart need drag-and-drop functionality and are you sure you need fade and fold effects to make your user interface better and more understandable?
  • Designing for the perfect use-case: Some say the user is dumb. But the user never stands a chance guessing your initial intentions. Design your interface defensively and provide feedback to unintended input. Think about how your user interface can and is interpreted by different users and remember to do user testing!
  • Bloated user interface: A bloated user interface tries to incorporate as many operations as can possibly fit into it with the end result of confusing more than helping the user to perform his or her task. A bloated user interface often include features that cannot be used on kinds of data the interface handles.

Organizational Anti Patterns

  • Fear Of Success: Past, often successful projects were threatened by visible problems; Managers implement number of disciplines to eliminate the problems – requirements, testing, architecture, management. Managers have seen those things work in the past. The current important project must be protected from any imaginable problem. No risk is too small to add a full-time worker to mitigate it. Management creates a defensive environment, analogous to large numbers of defenders on a soccer field trying to defend the goal and never attacking.
  • Untested but Finished:  The project is lagging in implementation, and the apparent cause is usually a programming bottleneck – too few or too slow programmers. Managers need to demonstrate progress or enlarge their span of influence; at worse, they want to be able to pretend to make progress. Managers hire outsiders to complete a  portion of the work. On delivery from the outsiders, dictate that the work is complete and only needs to be integrated by the main team. But the work that has supposedly been “finished” by outside forces is unfinished and  unusable.

For more information Organizational Anti Pattern.

Grey Patterns Anti Patterns

Generic Data Model: Generic Data Model is a pattern where all the attributes are separated from the object and saved in different table. It is a very flexible solution but in large databases the performance could be terrible.


This is just a scratch on the surface.  Anti Patterns are good tools to achieve better results in an IT project.  They are as important as the design patterns. Anti Patterns help to identify problems in a project and help to avoid implementing a bad solution to that problem.


The five Dysfunctions of a Team

#1 Absence of Trust

The fear of being vulnerable with team members prevents the building of trust within the team.

#2 Fear of Conflict

The desire to preserve the artificial harmony stifles the occurrence of productive, ideological conflict.

#3 Lack of Commitment

The lack of clarity or buy-in prevent the team members from making decisions they will stick to.

#4 Avoidance of Accountability

The need to avoid interpersonal discomfort prevents team members from holding one another accountable for their behaviors and performance.

#5 Inattention to Results

The pursuit of individual goals and personal status erode the focus on collective success.

The Role of a Leader

  1. Go first!
  2. Mine for conflicts.
  3. Force Clarity and Closure
  4. Confront difficult issues.
  5. Focus on collective outcomes

From the book. “The five Dysfunctions of a Team” by Patrick Lencioni

Database, Technology

ACID (Automicity, Consistency, Isolation and Durability)

ACID (an acronymn for Atomicity Consistency Isolation Durability) is a concept that Database Professionals generally look for when evaluating databases and application architectures. For a reliable database all this four attributes should be achieved. It is a set of properties that guarantee database transactions are processed reliably. In the context of database, transaction is a single logical operation the data.


Atomicity means that all transactions must follow “all or nothing” rule. Each transaction is said to be automic. If one part of the transaction fails, the entire transaction fails.

For example, consider an ATM transaction where you are moving money from one account to another. There are two parts in this transaction, first you remove money from one account, then you add money to another account. If one of these two parts fail, The entire transaction is considered invalid, and the transaction must be rolled back to the state before the transaction started.


This means that, the database will always be in a consistent state. Only valid data will be written to the database. If a transaction is executed that violate the consistency rules, the entire transaction will be rolled back. The database will be restored to a state consistent with those rules. If the transaction is successful, that will take the database to a new state which is also consistent with the database consistency rules.

For example, if a column is constrained to be NOT NULL and an application attempts to add a row with a NULL value in that column, the entire transaction must fail, and no part of the row may be added to the database.

In this example, if consistency were not upheld, the NULL value would initially still not be added as part of the row, but the remaining parts of the row would be added. However, since no value would be specified for the NOT NULL column, it would revert to NULL, anyway, and violate the rules of the database.


Multiple transactions occurring at the same time should not impact each others execution. Isolation keep transaction separated from each other until they are finished.

Consider for example, you and your partner and trying to withdraw all your money from your joint account at the same time from different ATMs. The transaction isolation guarantees that these two transactions operate on he database in an isolated manner. The database performs your transaction first and then your partner’s or vice versa. This prevents your transaction from reading intermediate data produced as a side effect of part of your partners’s transaction that will not eventually be committed to the database. However, the isolation property does not ensure which transaction will execute first, merely that they will not interfere with each other.


This ensures that the transaction committed to the database will not be lost. Durability is ensured through the use of database backups and transaction logs that facilitate the restoration of committed transactions in spite of any subsequent software or hardware failures.

If you are buying something from an online shop, if he system show that your credit card is processed and the order is placed, then that order remain in place, even if he system crashes.


Many databases rely upon locking to provide ACID capabilities. Locking means that the transaction marks the data that it accesses so that the DBMS knows not to allow other transactions to modify it until the first transaction succeeds or fails. The lock must always be acquired before processing data, including data that are read but not modified. Non-trivial transactions typically require a large number of locks, resulting in substantial overhead as well as blocking other transactions.

Database, Technology

Nested Loop, Hash and Merge Joins

When I was explaining some of my developers about the SQL Server Execution Plan, we came across the different types of joins SQL Server performs to optimize data retrieval. They are Nested Loop Join, Hash Join, Sort Merge Join. Interesting questions came up, what are the difference between them and how SQL Server determines which algorithm to use. So here is my take on explaining those questions.


Nested Loop Join (or NL Join)

A Nested Join compares each row from the Outer table to each row from the Inner table looking for the rows which satisfy the Join predicate.

For example, if you have statement like this one;

   SELECT A.*, B.* FROM  
   Customer A,  
   Sales B    
   WHERE A.CusomerId = B.CustomerId

It is interpreted as

For Each Row CR in Customer
    For Each Row SR in Sales
        IF CR join with SR Then
          Return (CR, SR)

The total number of rows compared and the cost of this algorithm is proportional to the size of the outer table multiplied by the size of the inner table. Since this cost grows quickly as the size of the input tables grow, in practice we try to minimize the cost by reducing the number of inner rows that we must consider for each outer row.

Lets take a look at the below SQL statement;

set statistics profile on;
select * from Sales.Store A 
Inner Join Sales.SalesPerson B on A.BusinessEntityID = B.BusinessEntityID 

You will see the execution plan as shown below.


Rows    Executes    StmtText    StmtId    NodeId    Parent    PhysicalOp    LogicalOp    Argument    DefinedValues    EstimateRows    EstimateIO    EstimateCPU    AvgRowSize    TotalSubtreeCost    OutputList    Warnings    Type    Parallel    EstimateExecutions

2    1    select * from Sales.Store A  Inner Join Sales.SalesPerson B on A.BusinessEntityID = B.BusinessEntityID    1    1    0    NULL    NULL    NULL    NULL    1    NULL    NULL    NULL    0.00918446    NULL    NULL    SELECT    0    NULL

2    1      |--Nested Loops(Inner Join, OUTER REFERENCES:([B].[BusinessEntityID]))    1    2    1    Nested Loops    Inner Join    OUTER REFERENCES:([B].[BusinessEntityID])    NULL    1    0    7.106E-05    4176    0.00918446    [A].[BusinessEntityID], [A].[Name], [A].[SalesPersonID], [A].[Demographics], [A].[rowguid], [A].[ModifiedDate], [B].[BusinessEntityID], [B].[TerritoryID], [B].[SalesQuota], [B].[Bonus], [B].[CommissionPct], [B].[SalesYTD], [B].[SalesLastYear], [B].[rowguid], [B].[ModifiedDate]    NULL    PLAN_ROW    0    1

18    1           |--Clustered Index Scan(OBJECT:([AdventureWorks2008R2].[Sales].[SalesPerson].[PK_SalesPerson_BusinessEntityID] AS [B]))    1    3    2    Clustered Index Scan    Clustered Index Scan    OBJECT:([AdventureWorks2008R2].[Sales].[SalesPerson].[PK_SalesPerson_BusinessEntityID] AS [B])    [B].[BusinessEntityID], [B].[TerritoryID], [B].[SalesQuota], [B].[Bonus], [B].[CommissionPct], [B].[SalesYTD], [B].[SalesLastYear], [B].[rowguid], [B].[ModifiedDate]    17    0.003125    0.0001757    76    0.0033007    [B].[BusinessEntityID], [B].[TerritoryID], [B].[SalesQuota], [B].[Bonus], [B].[CommissionPct], [B].[SalesYTD], [B].[SalesLastYear], [B].[rowguid], [B].[ModifiedDate]    NULL    PLAN_ROW    0    1

2    18           |--Clustered Index Seek(OBJECT:([AdventureWorks2008R2].[Sales].[Store].[PK_Store_BusinessEntityID] AS [A]), SEEK:([A].[BusinessEntityID]=[AdventureWorks2008R2].[Sales].[SalesPerson].[BusinessEntityID] as [B].[BusinessEntityID]) ORDERED FORWARD)    1    4    2    Clustered Index Seek    Clustered Index Seek    OBJECT:([AdventureWorks2008R2].[Sales].[Store].[PK_Store_BusinessEntityID] AS [A]), SEEK:([A].[BusinessEntityID]=[AdventureWorks2008R2].[Sales].[SalesPerson].[BusinessEntityID] as [B].[BusinessEntityID]) ORDERED FORWARD    [A].[BusinessEntityID], [A].[Name], [A].[SalesPersonID], [A].[Demographics], [A].[rowguid], [A].[ModifiedDate]    1    0.003125    0.0001581    4107    0.0058127    [A].[BusinessEntityID], [A].[Name], [A].[SalesPersonID], [A].[Demographics], [A].[rowguid], [A].[ModifiedDate]    NULL    PLAN_ROW    0    17

In this case the outer table is Sales.Store and the inner table is Sales.SalesPerson. As you can see there is a clustered index on column BusinessEntityID on Sales.SalesPerson table and Clustered Index on BusinessEntityID on Sales.Stores table. In this the SQL optimized determines that the optimal execution is to take SalesPerson as Outer table and Stores as inner table. Thus it performs an Clustered Index Scan on the Sales.SalesPerson table and Clustered Index Seek on the Stores table.

Notice that the index seek depends on A. BusinessEntityID which comes from the outer table of the join, which in this plan is SalesPerson. Each time we execute the index seek, A. BusinessEntityID has a different value. We refer to A. BusinessEntityID as a “correlated parameter”. If a nested loops join has correlated parameters, it is outputted in the plan as “OUTER REFERENCES.” This type of nested loops join where we have an index seek that depends on a correlated parameter is referred as an “index join.”

There are 3 variants of nested Join. In the simplest case, the search scans an entire table or index; this is called a naive nested loops join. If the search exploits an index, it is called an index nested loops join. If the index is built as part of the query plan (and destroyed upon completion of the query), it is called a temporary index nested loops join.

All these variants are considered by the query optimizer. A nested loops join is particularly effective if the outer input is quite small and the inner input is pre-indexed and quite large. In many small transactions, such as those affecting only a small set of rows, index nested loops joins are far superior to both merge joins and hash joins. In large queries, however, nested loops joins are often not the optimal choice.

Merge Join

The merge join requires that both inputs be sorted on the merge columns, which are defined by the equality (WHERE) clauses of the join predicate. The query optimizer typically scans an index, if one exists on the proper set of columns, or places a sort operator below the merge join. In rare cases, there may be multiple equality clauses, but the merge columns are taken from only some of the available equality clauses.

Because each input is sorted, the Merge Join operator gets a row from each input and compares them. For example, for inner join operations, the rows are returned if they are equal. If they are not equal, whichever row has the lower value is discarded and another row is obtained from that input. This process repeats until all rows have been processed.

The full operation is done as shown below.

Get First Row T1R from Table1
Get First Row T2R from Table2

While Not End of Table1 and Not End of Table2
  If T1R joins with T2R
      Get Next Row from Table2
      Returns (T1R, T2R)
  Else T1R < T2R
      Get NEXT Row from Table1
      Get Next Row from Table2
End Loop

the below statement in SQL Server Management Studio, and look at the execution plan.

set statistics profile on;
select * from Production.Product A
Inner Join Production.ProductInventory B On A.ProductID = B.ProductID


Rows    Executes    StmtText    StmtId    NodeId    Parent    PhysicalOp    LogicalOp    Argument    DefinedValues    EstimateRows    EstimateIO    EstimateCPU    AvgRowSize    TotalSubtreeCost    OutputList    Warnings    Type    Parallel    EstimateExecutions
1069 1 select * from Production.Product A Inner Join Production.ProductInventory B On A.ProductID = B.ProductID 1 1 0 NULL NULL NULL NULL 968.269 NULL NULL NULL 0.03063076 NULL NULL SELECT 0 NULL
1069 1 |--Merge Join(Inner Join, MERGE:([A].[ProductID])=([B].[ProductID]), RESIDUAL:([AdventureWorks2008R2].[Production].[ProductInventory].[ProductID] as [B].[ProductID]=[AdventureWorks2008R2].[Production].[Product].[ProductID] as [A].[ProductID])) 1 2 1 Merge Join Inner Join MERGE:([A].[ProductID])=([B].[ProductID]), RESIDUAL:([AdventureWorks2008R2].[Production].[ProductInventory].[ProductID] as [B].[ProductID]=[AdventureWorks2008R2].[Production].[Product].[ProductID] as [A].[ProductID]) NULL 968.269 0 0.009000127 274 0.03063076 [A].[ProductID], [A].[Name], [A].[ProductNumber], [A].[MakeFlag], [A].[FinishedGoodsFlag], [A].[Color], [A].[SafetyStockLevel], [A].[ReorderPoint], [A].[StandardCost], [A].[ListPrice], [A].[Size], [A].[SizeUnitMeasureCode], [A].[WeightUnitMeasureCode], [A].[Weight], [A].[DaysToManufacture], [A].[ProductLine], [A].[Class], [A].[Style], [A].[ProductSubcategoryID], [A].[ProductModelID], [A].[SellStartDate], [A].[SellEndDate], [A].[DiscontinuedDate], [A].[rowguid], [A].[ModifiedDate], [B].[ProductID], [B].[LocationID], [B].[Shelf], [B].[Bin], [B].[Quantity], [B].[rowguid], [B].[ModifiedDate] NULL PLAN_ROW 0 1
504 1 |--Clustered Index Scan(OBJECT:([AdventureWorks2008R2].[Production].[Product].[PK_Product_ProductID] AS [A]), ORDERED FORWARD) 1 3 2 Clustered Index Scan Clustered Index Scan OBJECT:([AdventureWorks2008R2].[Production].[Product].[PK_Product_ProductID] AS [A]), ORDERED FORWARD [A].[ProductID], [A].[Name], [A].[ProductNumber], [A].[MakeFlag], [A].[FinishedGoodsFlag], [A].[Color], [A].[SafetyStockLevel], [A].[ReorderPoint], [A].[StandardCost], [A].[ListPrice], [A].[Size], [A].[SizeUnitMeasureCode], [A].[WeightUnitMeasureCode], [A].[Weight], [A].[DaysToManufacture], [A].[ProductLine], [A].[Class], [A].[Style], [A].[ProductSubcategoryID], [A].[ProductModelID], [A].[SellStartDate], [A].[SellEndDate], [A].[DiscontinuedDate], [A].[rowguid], [A].[ModifiedDate] 504 0.01201389 0.0007114 229 0.01272529 [A].[ProductID], [A].[Name], [A].[ProductNumber], [A].[MakeFlag], [A].[FinishedGoodsFlag], [A].[Color], [A].[SafetyStockLevel], [A].[ReorderPoint], [A].[StandardCost], [A].[ListPrice], [A].[Size], [A].[SizeUnitMeasureCode], [A].[WeightUnitMeasureCode], [A].[Weight], [A].[DaysToManufacture], [A].[ProductLine], [A].[Class], [A].[Style], [A].[ProductSubcategoryID], [A].[ProductModelID], [A].[SellStartDate], [A].[SellEndDate], [A].[DiscontinuedDate], [A].[rowguid], [A].[ModifiedDate] NULL PLAN_ROW 0 1
1069 1 |--Clustered Index Scan(OBJECT:([AdventureWorks2008R2].[Production].[ProductInventory].[PK_ProductInventory_ProductID_LocationID] AS [B]), ORDERED FORWARD) 1 4 2 Clustered Index Scan Clustered Index Scan OBJECT:([AdventureWorks2008R2].[Production].[ProductInventory].[PK_ProductInventory_ProductID_LocationID] AS [B]), ORDERED FORWARD [B].[ProductID], [B].[LocationID], [B].[Shelf], [B].[Bin], [B].[Quantity], [B].[rowguid], [B].[ModifiedDate] 1069 0.007569444 0.0013329 54 0.008902345 [B].[ProductID], [B].[LocationID], [B].[Shelf], [B].[Bin], [B].[Quantity], [B].[rowguid], [B].[ModifiedDate] NULL PLAN_ROW 0 1

The merge join operation may be either a regular or a many-to-many operation. A many-to-many merge join uses a temporary table to store rows. If there are duplicate values from each input, one of the inputs will have to rewind to the start of the duplicates as each duplicate from the other input is processed. If a residual predicate is present, all rows that satisfy the merge predicate will evaluate the residual predicate, and only those rows that satisfy it will be returned.

Merge join itself is very fast, but it can be an expensive choice if sort operations are required. However, if the data volume is large and the desired data can be obtained presorted from existing B-tree indexes, merge join is often the fastest available join algorithm.

Hash Join

The hash join has two inputs: the build input and probe input. The query optimizer assigns these roles so that the smaller of the two inputs is the build input.

Hash joins are used for many types of set-matching operations: inner join; left, right, and full outer join; left and right semi-join; intersection; union; and difference. Moreover, a variant of the hash join can do duplicate removal and grouping (such as SUM(salary) GROUP BY department). These modifications use only one input for both the build and probe roles.

Similar to a merge join, a hash join can be used only if there is at least one equality (WHERE) clause in the join predicate. However, because joins are typically used to reassemble relationships, expressed with an equality predicate between a primary key and a foreign key, most joins have at least one equality clause. The set of columns in the equality predicate is called the hash key, because these are the columns that contribute to the hash function. Additional predicates are possible and are evaluated as residual predicates separate from the comparison of hash values. The hash key can be an expression, as long as it can be computed exclusively from columns in a single row. In grouping operations, the columns of the group by list are the hash key. In set operations such as intersection, as well as in the removal of duplicates, the hash key consists of all columns.


In-Memory Hash Join

The hash join first scans or computes the entire build input and then builds a hash table in memory. Each row is inserted into a hash bucket depending on the hash value computed for the hash key. If the entire build input is smaller than the available memory, all rows can be inserted into the hash table. This build phase is followed by the probe phase. The entire probe input is scanned or computed one row at a time, and for each probe row, the hash key’s value is computed, the corresponding hash bucket is scanned, and the matches are produced.


Grace Hash Join

If the build input does not fit in memory, a hash join proceeds in several steps. Each step has a build phase and probe phase. Initially, the entire build and probe inputs are consumed and partitioned (using a hash function on the hash keys) into multiple files. The number of such files is called the partitioning fan-out. Using the hash function on the hash keys guarantees that any two joining records must be in the same pair of files. Therefore, the task of joining two large inputs has been reduced to multiple, but smaller, instances of the same tasks. The hash join is then applied to each pair of partitioned files.


Recursive Hash Join

If the build input is so large that inputs for a standard external merge sorts would require multiple merge levels, multiple partitioning steps and multiple partitioning levels are required. If only some of the partitions are large, additional partitioning steps are used for only those specific partitions. In order to make all partitioning steps as fast as possible, large, asynchronous I/O operations are used so that a single thread can keep multiple disk drives busy.

Lets take a look at the execution plan of the below sql statement.

set statistics profile on;
select * from Sales.Customer C
Inner Join Sales.SalesOrderHeader O on C.CustomerID = O.CustomerID
You can see SQL optimizer chose Hash Join to retrieve data in this plan.


Rows    Executes    StmtText    StmtId    NodeId    Parent    PhysicalOp    LogicalOp    Argument    DefinedValues    EstimateRows    EstimateIO    EstimateCPU    AvgRowSize    TotalSubtreeCost    OutputList    Warnings    Type    Parallel    EstimateExecutions
31465 1 select * from Sales.Customer C Inner Join Sales.SalesOrderHeader O on C.CustomerID = O.CustomerID 1 1 0 NULL NULL NULL NULL 31098.69 NULL NULL NULL 1.455217 NULL NULL SELECT 0 NULL
31465 1 |--Hash Match(Inner Join, HASH:([C].[CustomerID])=([O].[CustomerID])) 1 2 1 Hash Match Inner Join HASH:([C].[CustomerID])=([O].[CustomerID]) NULL 31098.69 0 0.7886466 389 1.455217 [C].[CustomerID], [C].[PersonID], [C].[StoreID], [C].[TerritoryID], [C].[rowguid], [C].[ModifiedDate], [C].[AccountNumber], [O].[SalesOrderID], [O].[RevisionNumber], [O].[OrderDate], [O].[DueDate], [O].[ShipDate], [O].[Status], [O].[OnlineOrderFlag], [O].[PurchaseOrderNumber], [O].[AccountNumber], [O].[CustomerID], [O].[SalesPersonID], [O].[TerritoryID], [O].[BillToAddressID], [O].[ShipToAddressID], [O].[ShipMethodID], [O].[CreditCardID], [O].[CreditCardApprovalCode], [O].[CurrencyRateID], [O].[SubTotal], [O].[TaxAmt], [O].[Freight], [O].[Comment], [O].[rowguid], [O].[ModifiedDate], [O].[SalesOrderNumber], [O].[TotalDue] NULL PLAN_ROW 0 1
0 0 |--Compute Scalar(DEFINE:([C].[AccountNumber]=[AdventureWorks2008R2].[Sales].[Customer].[AccountNumber] as [C].[AccountNumber])) 1 3 2 Compute Scalar Compute Scalar DEFINE:([C].[AccountNumber]=[AdventureWorks2008R2].[Sales].[Customer].[AccountNumber] as [C].[AccountNumber]) [C].[AccountNumber]=[AdventureWorks2008R2].[Sales].[Customer].[AccountNumber] as [C].[AccountNumber] 19820 0 0.001982 56 0.1179369 [C].[CustomerID], [C].[PersonID], [C].[StoreID], [C].[TerritoryID], [C].[rowguid], [C].[ModifiedDate], [C].[AccountNumber] NULL PLAN_ROW 0 1
0 0 | |--Compute Scalar(DEFINE:([C].[AccountNumber]=isnull('AW'+[AdventureWorks2008R2].[dbo].[ufnLeadingZeros]([AdventureWorks2008R2].[Sales].[Customer].[CustomerID] as [C].[CustomerID]),''))) 1 4 3 Compute Scalar Compute Scalar DEFINE:([C].[AccountNumber]=isnull('AW'+[AdventureWorks2008R2].[dbo].[ufnLeadingZeros]([AdventureWorks2008R2].[Sales].[Customer].[CustomerID] as [C].[CustomerID]),'')) [C].[AccountNumber]=isnull('AW'+[AdventureWorks2008R2].[dbo].[ufnLeadingZeros]([AdventureWorks2008R2].[Sales].[Customer].[CustomerID] as [C].[CustomerID]),'') 19820 0 0.001982 56 0.1159549 [C].[CustomerID], [C].[PersonID], [C].[StoreID], [C].[TerritoryID], [C].[AccountNumber], [C].[rowguid], [C].[ModifiedDate] NULL PLAN_ROW 0 1
19820 1 | |--Clustered Index Scan(OBJECT:([AdventureWorks2008R2].[Sales].[Customer].[PK_Customer_CustomerID] AS [C])) 1 5 4 Clustered Index Scan Clustered Index Scan OBJECT:([AdventureWorks2008R2].[Sales].[Customer].[PK_Customer_CustomerID] AS [C]) [C].[CustomerID], [C].[PersonID], [C].[StoreID], [C].[TerritoryID], [C].[rowguid], [C].[ModifiedDate] 19820 0.09201389 0.021959 47 0.1139729 [C].[CustomerID], [C].[PersonID], [C].[StoreID], [C].[TerritoryID], [C].[rowguid], [C].[ModifiedDate] NULL PLAN_ROW 0 1
0 0 |--Compute Scalar(DEFINE:([O].[SalesOrderNumber]=[AdventureWorks2008R2].[Sales].[SalesOrderHeader].[SalesOrderNumber] as [O].[SalesOrderNumber], [O].[TotalDue]=[AdventureWorks2008R2].[Sales].[SalesOrderHeader].[TotalDue] as [O].[TotalDue])) 1 12 2 Compute Scalar Compute Scalar DEFINE:([O].[SalesOrderNumber]=[AdventureWorks2008R2].[Sales].[SalesOrderHeader].[SalesOrderNumber] as [O].[SalesOrderNumber], [O].[TotalDue]=[AdventureWorks2008R2].[Sales].[SalesOrderHeader].[TotalDue] as [O].[TotalDue]) [O].[SalesOrderNumber]=[AdventureWorks2008R2].[Sales].[SalesOrderHeader].[SalesOrderNumber] as [O].[SalesOrderNumber], [O].[TotalDue]=[AdventureWorks2008R2].[Sales].[SalesOrderHeader].[TotalDue] as [O].[TotalDue] 31465 0 0.0031465 341 0.548631 [O].[SalesOrderID], [O].[RevisionNumber], [O].[OrderDate], [O].[DueDate], [O].[ShipDate], [O].[Status], [O].[OnlineOrderFlag], [O].[PurchaseOrderNumber], [O].[AccountNumber], [O].[CustomerID], [O].[SalesPersonID], [O].[TerritoryID], [O].[BillToAddressID], [O].[ShipToAddressID], [O].[ShipMethodID], [O].[CreditCardID], [O].[CreditCardApprovalCode], [O].[CurrencyRateID], [O].[SubTotal], [O].[TaxAmt], [O].[Freight], [O].[Comment], [O].[rowguid], [O].[ModifiedDate], [O].[SalesOrderNumber], [O].[TotalDue] NULL PLAN_ROW 0 1
0 0 |--Compute Scalar(DEFINE:([O].[SalesOrderNumber]=isnull(N'SO'+CONVERT(nvarchar(23),[AdventureWorks2008R2].[Sales].[SalesOrderHeader].[SalesOrderID] as [O].[SalesOrderID],0),N'*** ERROR ***'), [O].[TotalDue]=isnull(([AdventureWorks2008R2].[Sales].[SalesOrderHeader].[SubTotal] as [O].[SubTotal]+[AdventureWorks2008R2].[Sales].[SalesOrderHeader].[TaxAmt] as [O].[TaxAmt])+[AdventureWorks2008R2].[Sales].[SalesOrderHeader].[Freight] as [O].[Freight],($0.0000)))) 1 13 12 Compute Scalar Compute Scalar DEFINE:([O].[SalesOrderNumber]=isnull(N'SO'+CONVERT(nvarchar(23),[AdventureWorks2008R2].[Sales].[SalesOrderHeader].[SalesOrderID] as [O].[SalesOrderID],0),N'*** ERROR ***'), [O].[TotalDue]=isnull(([AdventureWorks2008R2].[Sales].[SalesOrderHeader].[SubTotal] as [O].[SubTotal]+[AdventureWorks2008R2].[Sales].[SalesOrderHeader].[TaxAmt] as [O].[TaxAmt])+[AdventureWorks2008R2].[Sales].[SalesOrderHeader].[Freight] as [O].[Freight],($0.0000))) [O].[SalesOrderNumber]=isnull(N'SO'+CONVERT(nvarchar(23),[AdventureWorks2008R2].[Sales].[SalesOrderHeader].[SalesOrderID] as [O].[SalesOrderID],0),N'*** ERROR ***'), [O].[TotalDue]=isnull(([AdventureWorks2008R2].[Sales].[SalesOrderHeader].[SubTotal] as [O].[SubTotal]+[AdventureWorks2008R2].[Sales].[SalesOrderHeader].[TaxAmt] as [O].[TaxAmt])+[AdventureWorks2008R2].[Sales].[SalesOrderHeader].[Freight] as [O].[Freight],($0.0000)) 31465 0 0.0031465 341 0.5454844 [O].[SalesOrderID], [O].[RevisionNumber], [O].[OrderDate], [O].[DueDate], [O].[ShipDate], [O].[Status], [O].[OnlineOrderFlag], [O].[SalesOrderNumber], [O].[PurchaseOrderNumber], [O].[AccountNumber], [O].[CustomerID], [O].[SalesPersonID], [O].[TerritoryID], [O].[BillToAddressID], [O].[ShipToAddressID], [O].[ShipMethodID], [O].[CreditCardID], [O].[CreditCardApprovalCode], [O].[CurrencyRateID], [O].[SubTotal], [O].[TaxAmt], [O].[Freight], [O].[TotalDue], [O].[Comment], [O].[rowguid], [O].[ModifiedDate] NULL PLAN_ROW 0 1
31465 1 |--Clustered Index Scan(OBJECT:([AdventureWorks2008R2].[Sales].[SalesOrderHeader].[PK_SalesOrderHeader_SalesOrderID] AS [O])) 1 14 13 Clustered Index Scan Clustered Index Scan OBJECT:([AdventureWorks2008R2].[Sales].[SalesOrderHeader].[PK_SalesOrderHeader_SalesOrderID] AS [O]) [O].[SalesOrderID], [O].[RevisionNumber], [O].[OrderDate], [O].[DueDate], [O].[ShipDate], [O].[Status], [O].[OnlineOrderFlag], [O].[PurchaseOrderNumber], [O].[AccountNumber], [O].[CustomerID], [O].[SalesPersonID], [O].[TerritoryID], [O].[BillToAddressID], [O].[ShipToAddressID], [O].[ShipMethodID], [O].[CreditCardID], [O].[CreditCardApprovalCode], [O].[CurrencyRateID], [O].[SubTotal], [O].[TaxAmt], [O].[Freight], [O].[Comment], [O].[rowguid], [O].[ModifiedDate] 31465 0.5075694 0.0347685 305 0.542338 [O].[SalesOrderID], [O].[RevisionNumber], [O].[OrderDate], [O].[DueDate], [O].[ShipDate], [O].[Status], [O].[OnlineOrderFlag], [O].[PurchaseOrderNumber], [O].[AccountNumber], [O].[CustomerID], [O].[SalesPersonID], [O].[TerritoryID], [O].[BillToAddressID], [O].[ShipToAddressID], [O].[ShipMethodID], [O].[CreditCardID], [O].[CreditCardApprovalCode], [O].[CurrencyRateID], [O].[SubTotal], [O].[TaxAmt], [O].[Freight], [O].[Comment], [O].[rowguid], [O].[ModifiedDate] NULL PLAN_ROW 0 1


Note: If the build input is larger but not a lot larger than the available memory, elements of in-memory hash join and grace hash join are combined in a single step, producing a hybrid hash join.

It is not always possible during optimization to determine which hash join will be used. Therefore,  SQL Server starts using an in-memory hash join and gradually transitions to grace hash join, and recursive hash join, depending on the size of the build input.

If the optimizer anticipates wrongly which of the two inputs is smaller and, therefore, should have been the build input, the build and probe roles are reversed dynamically. The hash join makes sure that it uses the smaller overflow file as build input. This technique is called role reversal.


SQL Server Optimizer determines the best, optimized plan for executing the query. The optimizer evaluate the  indices on the table columns, number of rows in the tables , type of joins  etc. when it chose the best plan. 

Nested loop is generally chosen for smaller tables and if it is possible to do Index seek on the inner table to ensure better performance.

Merge Join is considered to be more efficient for large tables with the keys columns sorted. It only need to read the keys once, and does not require lot of CPU cycles.  

Hash Join is also better suited for large tables. This requires less IO, but need more CPU and requires lot of memory.

That being said, it is possible  to override the optimizer’s choice of the join by using the OPTION hint. However, the optimizer is very smart, it does not make very many bad choices. So use caution when you use the  OPTION hint in your queries.  Updating statistics to fix your indexes help the optimizer to choose  the right Join.