System.DirectoryServices Search Performance – Part 4

NOTE: This post is part of a series.

Evaluating Performance of LDAP Queries

LDAP search performance sometimes feels like more art than science.  The biggest reason is that, at first glance, there don’t appear to be any tools available to evaluate your query and report on its efficiency.

STATS Control

LDAP has what is called the STATS control.  When the STATS control is enabled, some basic statistics about your query are returned: query time, entries returned, entries visited, the filter used, and the indices used.

I find the most useful information returned from STATS is entries returned vs entries visited.  For example the query:

(telephoneNumber=555-555-5555)

Objects Visited = 11,588

Objects Returned = 1

Since my filter contains no indexed attributes the directory has to crawl through every object to determine if it has a telephoneNumber of 555-555-5555.  Compare that to:

(&(objectCategory=user)(telephoneNumber=555-555-5555))

Objects Visited = 257

Objects Returned = 1

As you can see, by adding the indexed attribute (objectCategory=user) we reduce the number of objects to crawl to 257, since the directory was able to use an index to find all of the objects that are users.
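The comparison between the two queries comes down to a hit rate: entries returned divided by entries visited.  Here is a minimal sketch of that arithmetic in C#.  The class and method names are made up for illustration and are not part of any library.

```csharp
using System;

public static class QueryStats
{
    // Percentage of visited entries that were actually returned.
    // Returns 0 when nothing was visited, to avoid dividing by zero.
    public static double HitRatePercent(long returned, long visited)
    {
        if (returned < 0 || visited < 0)
            throw new ArgumentException("Counts must not be negative.");
        return visited == 0 ? 0.0 : 100.0 * returned / visited;
    }
}
```

Calling QueryStats.HitRatePercent(1, 11588) gives a hit rate under a hundredth of a percent, while QueryStats.HitRatePercent(1, 257) gives roughly 0.39 percent.  The closer the number gets to 100, the less wasted work the directory is doing.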

I could go on for days giving examples of good and bad filters.  The guidelines laid out in Part 2 will give you the knowledge you need to write good filters.  Now, with your knowledge of the STATS control, you have a way to quantify whether you’re writing efficient filters or not.

AdFind

I find the best way to view the output of the STATS control is to use AdFind by joeware.  AdFind has almost endless versatility in doing LDAP searches; however, I am just focusing on its ability to report STATS.

Running AdFind.exe without any switches will give you the usage information.  However a quick crash course in AdFind:

adfind.exe -default -f "(attribute=value)" -stats+Only

-default = Connect to default LDAP server (works if your computer is member of domain you want to search)

-f = Filter to query on

-stats+Only = Return stats information with analysis and do not return actual results

An example run of AdFind:

C:\>adfind -default -f "(&(objectCategory=user)(telephoneNumber=555-555-5555))" -stats+Only

AdFind V01.37.00cpp Joe Richards (joe@joeware.net) June 2007

Using server: dc.domain.local:389
Directory: Windows Server 2003
Base DN: DC=domain,DC=local

Statistics
=================================
Elapsed Time: 16 (ms)
Returned 1 entries of 257 visited - (0.39%)

Used Filter:
( &  (objectCategory=CN=Person,CN=Schema,CN=Configuration,DC=domain,DC=local)  (telephoneNumber=555-555-5555) )

Used Indices:
idx_objectCategory:220:N

Analysis
---------------------------------
Hit Rate of 0.39% is Inefficient

Indices used:

Index Name  : idx_objectCategory
Record Count: 220  (estimate)
Index Type  : Normal Attribute Index

Filter Breakdown:

(&
  (objectCategory=CN=Person,CN=Schema,CN=Configuration,DC=domain,DC=local)
  (telephoneNumber=555-555-5555)
)


System.DirectoryServices Search Performance – Part 3

NOTE: This post is part of a series.

Advanced Binding Options

We’re going to take a little detour from search performance to talk about high performance binding to the directory.  The first thing a search does is bind to the directory.  And presumably you’ll bind to the objects that your search returns.  So even the most efficient search might not matter if you’re binding the slowest way possible.  We’ll also talk about how to perform the fastest search possible . . . . not searching.

Connect with FastBind

System.DirectoryServices is a .Net wrapper around ADSI.  When ADSI binds to an object it determines what kind of object it is (e.g. User, Computer, Group).  This is necessary to map it to a strongly typed ADSI type (e.g. IADsUser, IADsGroup).  However, the DirectoryEntry object does not use any of the functionality of the strongly typed ADSI classes.  The only exception to this is when using the Invoke methods (primarily used to interact with passwords).

Using FastBind in our AuthenticationType prevents ADSI from determining the object type on creation of the DirectoryEntry object.  This means that we can prevent one round trip to the server by using FastBind.  The downsides to using FastBind are:

  • Does not check for existence of object.  This can complicate error handling.
  • Invoke method is unavailable.

FastBind can be combined with any other AuthenticationTypes value.  Here we are combining it with AuthenticationTypes.Secure.

DirectoryEntry de = new DirectoryEntry();
de.AuthenticationType =
   AuthenticationTypes.Secure | AuthenticationTypes.FastBind;

Find Users with Global Catalog Servers

In a multi-domain forest we need a way to find users across the different domains without looking in each individual domain.  This is where the global catalog server comes in.  Global catalog servers contain a read-only subset of information about every object in the forest.  To find a user in the GC and then connect to its R/W object in LDAP you need to do the following:

//Connect to Root of GC
DirectoryEntry gc = new DirectoryEntry("GC:");
foreach (DirectoryEntry root in gc.Children)
{
   //we know there's only one child of GC:
   gc = root; break;
}
//search GC for user
DirectorySearcher searcher = new DirectorySearcher();
searcher.SearchRoot = gc;
searcher.Filter =
   "(&(objectClass=user)(sAMAccountName=scevans))";
SearchResult result = searcher.FindOne();
//retrieve DN of user object
string path =
   result.Properties["distinguishedName"][0].ToString();
//connect to user object via LDAP instead of GC
DirectoryEntry user =
   new DirectoryEntry("LDAP://" + path);
Console.WriteLine(user.Properties["sAMAccountName"].Value);

Cache LDAP Connections

The first time you create a DirectoryEntry, an LDAP connection is created under the covers.  Future DirectoryEntries will use that same LDAP connection as long as the following criteria are true:

  • The same server and port are used
  • The same credentials are used
  • The same AuthenticationTypes flags are used (with the exception of the FastBind and ServerBind flags)
  • A previous DirectoryEntry (original or subsequent) that has been opened is still open.

So for example the following code will only create a single LDAP connection:

DirectoryEntry rootDSE = new DirectoryEntry("LDAP://rootDSE");
//force connection to LDAP
Object temp = rootDSE.NativeObject;
DirectoryEntry user = new DirectoryEntry
   ("LDAP://CN=jdoe,OU=Staff,DC=domain,DC=com");
//read attribute from LDAP
//Same LDAP connection used
string username =
   user.Properties["sAMAccountName"].Value.ToString();

However, if the RootDSE object had been disposed before the user object had been created, the LDAP connection would have been closed between the two objects.

Creating a connection to LDAP and authenticating is a significant overhead in the process.  The fewer times we create LDAP connections, the better our performance will be.  Also, repeated LDAP connection creations can cause the server’s TCP stack to run out of wildcard ports.
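One way to apply the criteria above is to hold a single DirectoryEntry open for the life of a batch of work, so that every entry created inside it has a chance to share the connection.  The following is a sketch only: the distinguished names are hypothetical, and whether the connection is actually reused still depends on the server, credentials, and flags matching as described above.

```csharp
using System;
using System.DirectoryServices;

public static class ConnectionReuse
{
    public static void Main()
    {
        // Hypothetical distinguished names, for illustration only.
        string[] userDns =
        {
            "CN=jdoe,OU=Staff,DC=domain,DC=com",
            "CN=asmith,OU=Staff,DC=domain,DC=com"
        };

        // Keep this entry open until all of the work is finished.
        using (var anchor = new DirectoryEntry("LDAP://rootDSE"))
        {
            // Reading NativeObject forces the bind to happen now.
            object forceBind = anchor.NativeObject;

            foreach (string dn in userDns)
            {
                // Each user entry is disposed promptly, but the anchor
                // entry is still open, so the connection can stay alive.
                using (var user = new DirectoryEntry("LDAP://" + dn))
                {
                    Console.WriteLine(user.Properties["sAMAccountName"].Value);
                }
            }
        }
    }
}
```

Disposing each user entry as soon as you are done with it is still worthwhile.  It is the long-lived anchor entry that keeps the connection from being torn down in between.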

GUID Binding

The fastest way to search for an object is to not have to search for an object.  Active Directory supports what is called GUID binding.  You can retrieve the GUID from a DirectoryEntry object, store it somewhere (SQL, file, memory) and use that GUID to rebind to the object later.  The GUID never changes, so even if the object is moved across domains you will still be able to bind to the object.

//retrieve user via search
DirectorySearcher searcher = new DirectorySearcher();
searcher.Filter =
   "(&(objectClass=user)(sAMAccountName=jdoe))";
SearchResult result = searcher.FindOne();
DirectoryEntry user = result.GetDirectoryEntry();
//get user's GUID and dispose object
Guid guid = user.Guid;
user.Dispose();
//bind to user via GUID
user =
   new DirectoryEntry(string.Format("LDAP://<GUID={0}>", guid));
Console.WriteLine(user.Properties["sAMAccountName"].Value);

System.DirectoryServices.Protocols

.Net 2.0 includes the System.DirectoryServices.Protocols namespace.  The major difference between S.DS and S.DS.P is that Protocols does not use ADSI.  This means you have more control over your connections, you get better performance, etc.  However it’s a lot harder to write code in the Protocols namespace.  If performance is the most essential thing though you may want to look at the Protocols namespace.
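To give a feel for the extra work involved, here is a minimal sketch of a search using the Protocols namespace.  The server name, base DN, and filter are assumptions chosen to match the examples elsewhere in this series, so replace them with your own.  Note that you create, bind, and dispose the connection yourself.

```csharp
using System;
using System.DirectoryServices.Protocols;

public static class ProtocolsSearch
{
    public static void Main()
    {
        // Assumed server name; replace with one of your domain controllers.
        var identifier = new LdapDirectoryIdentifier("dc.domain.local");

        using (var connection = new LdapConnection(identifier))
        {
            // Authenticate as the current Windows user.
            connection.AuthType = AuthType.Negotiate;
            connection.Bind();

            // Assumed base DN and filter; only two attributes are requested.
            var request = new SearchRequest(
                "DC=domain,DC=local",
                "(&(objectCategory=user)(sAMAccountName=jdoe))",
                SearchScope.Subtree,
                "sAMAccountName", "telephoneNumber");

            var response = (SearchResponse)connection.SendRequest(request);

            foreach (SearchResultEntry entry in response.Entries)
            {
                Console.WriteLine(entry.DistinguishedName);
            }
        }
    }
}
```

Because the connection is an explicit object here, you decide exactly how long it lives and how many requests share it.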

System.DirectoryServices Search Performance – Part 2

NOTE: This post is part of a series.

Writing Efficient LDAP Filters

Writing efficient LDAP filters sometimes feels like more art than science.  Mostly that’s because the tools to make it a science are not always well known or understood.  We’ll tackle measuring the performance of your queries later in this series, but first let’s go over some basics of good filter design.

Use an Indexed Attribute

Whenever possible include an indexed attribute in your search.  Having an attribute that is indexed will quickly reduce the number of objects to search when filtering the non-indexed attributes.

Generally speaking, only a single index will be used per query (more complicated queries may use multiple indexes).  The index that has the most uniqueness will be used.  So for example, if you had two attributes, EmployeeID and EmployeeType, that are both indexed, EmployeeID is more likely to be used as the indexed attribute because it will have fewer collisions than EmployeeType does.  Basically, the LDAP server will try to do what it thinks is the most efficient.

Avoid Substring Searches

The default index is designed to handle wildcard characters at the end of a string relatively well.  For example doing a search for:

(givenName=steve*)

will give us reasonable performance.  However placing wildcard characters at the beginning or in the middle of strings causes significant performance issues.  For example:

(givenName=ste*en)

will produce undesirable performance issues.  Generally speaking writing a filter of:

(|

    (givenName=steve*)

    (givenName=stephen)

)

will produce better performance results.  Starting in Windows Server 2003 you can create a medial or tuple index on attributes.  However, these indexes are not as efficient as regular indexes.  We will cover creating indexes and testing the performance of your indexes in a future post in this series.
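If you find yourself rewriting medial wildcards into OR filters often, a small helper can build the filter string for you.  This is a minimal sketch: the class and method names are made up for illustration, and it assumes the patterns you pass in are already safe to place in a filter (it does no escaping).

```csharp
using System;
using System.Text;

public static class LdapFilters
{
    // Builds (|(attr=p1)(attr=p2)...) from the supplied patterns.
    // A single pattern is returned as a plain (attr=p1) term.
    public static string AnyOf(string attribute, params string[] patterns)
    {
        if (string.IsNullOrEmpty(attribute))
            throw new ArgumentException("An attribute name is required.");
        if (patterns == null || patterns.Length == 0)
            throw new ArgumentException("At least one pattern is required.");

        if (patterns.Length == 1)
            return "(" + attribute + "=" + patterns[0] + ")";

        var sb = new StringBuilder("(|");
        foreach (string pattern in patterns)
        {
            sb.Append('(').Append(attribute).Append('=').Append(pattern).Append(')');
        }
        return sb.Append(')').ToString();
    }
}
```

For example, LdapFilters.AnyOf("givenName", "steve*", "stephen") produces the same filter as the one shown above, written on a single line.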

Avoid Using the NOT (!) Operator

The query processor will treat attributes you do not have permission to read and attributes with no value as matches when you use the NOT operator.  This can result in unnecessary objects being returned from your query.  Also, the NOT operator prevents the use of indices on those attributes.

Avoid ANR Searches

Ambiguous Name Resolution (ANR) is a great feature introduced in Windows Server 2003 that helps us find a user when we know very little about them.  For example the LDAP filter:

(anr=doe)

Will be executed as:

(|

   (displayName=doe*)

   (givenName=doe*)

   (legacyExchangeDN=doe*)

   (msDS-AdditionalSamAccountName=doe*)

   (physicalDeliveryOfficeName=doe*)

   (proxyAddresses=doe*)

   (name=doe*)

   (sAMAccountName=doe*)

   (sn=doe*)

)

As you can see, ANR searches 9 attributes (by default) for us.  Fortunately those attributes are usually indexed, so simple ANR searches are not too bad.  However, if you execute a lot of ANR searches in a short period of time, or you do multiple ANR searches in the same query (|(anr=doe)(anr=john)(anr=jane)), you can see how the problem can quickly get out of hand.

Avoid Bitwise Comparisons

Some attributes in Active Directory are a representation of bitwise flags.  The most common bitwise attribute is UserAccountControl.  Bit 0x2 in UserAccountControl signifies if the account is enabled or disabled.  So the filter:

(userAccountControl:1.2.840.113556.1.4.803:=2)

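returns all accounts that are disabled.  (The 803 rule is a bitwise AND, so every bit in the value must be set.  The 804 rule is a bitwise OR, which matches when any bit in the value is set, so with a single bit it returns the same accounts.  To find accounts that are enabled, negate the filter instead: (!(userAccountControl:1.2.840.113556.1.4.803:=2)))

A bitwise comparison of an attribute prevents the use of an index for that attribute.  When using a bitwise comparison, try to include an indexed attribute in your query as well.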
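Here is a minimal sketch of pairing the bitwise comparison with an indexed attribute.  The class and constant names are made up for illustration.  The indexed term (objectCategory=user) comes first so the directory can narrow the candidates before testing the flag.

```csharp
public static class UacFilters
{
    // OID of the bitwise AND matching rule used in the filter above.
    public const string BitAnd = "1.2.840.113556.1.4.803";

    // Bit 0x2 of userAccountControl marks a disabled account.
    public const int AccountDisable = 0x2;

    // Disabled user accounts, narrowed by an indexed attribute.
    public static string DisabledUsers()
    {
        return "(&(objectCategory=user)(userAccountControl:"
            + BitAnd + ":=" + AccountDisable + "))";
    }
}
```

The result can be assigned directly to the Filter property of a DirectorySearcher.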

System.DirectoryServices Search Performance – Part 1

NOTE: This post is part of a series.

At Silicon Valley Code Camp we got into a discussion about efficient LDAP queries. I thought it would be best to write a few blog posts about performance considerations of searching when using System.DirectoryServices (including LDAP queries). I’m using the MSDN article Creating More Efficient Microsoft Active Directory-Enabled Applications as the basis for my posts; however, personal experience and other sources are sprinkled in heavily. I find the MSDN article talks about performance from a statistical standpoint as opposed to a real world standpoint.

Today’s post is about configuring your DirectorySearcher object. Future posts will cover:

  • LDAP Filters
  • Advanced Binding Options
  • Analyzing Performance of Searches
  • Creating New Indexes

Some of this information is specific to Active Directory/ADAM. However many if not most of the concepts are universal to all LDAP servers.

Set Appropriate Search Root

If you were searching your computer for a file you would have your searching tool start at the closest folder that you could to where the file could possibly be. For example if you know the file is in C:\Windows you would start the search there instead of C:\. The closer you get to the file, the faster the search will be.

The same concept applies to LDAP queries. If you know the object is under OU=Staff,DC=domain,dc=com then you would want to start searching there.

In .Net this is easy to do. On your DirectorySearcher object set the property SearchRoot to where you want the search to start.

DirectorySearcher searcher = new DirectorySearcher();
searcher.SearchRoot =
   new DirectoryEntry("LDAP://OU=Staff,DC=domain,DC=local");

If you do not set the SearchRoot (or set it to null) the search will look at RootDSE of the “default” LDAP server (meaning this only works on an AD connected client) and start the search from the DefaultContext. That’s a fancy way of saying that you will search your entire domain.

Set Appropriate Search Scope

There are three kinds of search scope:

  • Base – Only search the object specified as the SearchRoot
  • One-Level – Only search direct children of the SearchRoot
  • Subtree – Search all children of the SearchRoot, including sub-children.

In .Net the base level search is often not necessary because if we know the path to the object we can simply create a DirectoryEntry object for it. However there are circumstances where it becomes useful.

The theory of setting search scope is similar to setting an appropriate search root. If you know the object is in the SearchRoot container there is no need to iterate through sub-OUs. The fewer objects the directory has to search, the faster your query will be.

Setting your search scope in .Net is once again trivial. Here we set it to One-Level.

DirectorySearcher searcher = new DirectorySearcher();
searcher.SearchScope = SearchScope.OneLevel;

Set PropertiesToLoad Property

You can define which LDAP attributes should be returned when searching for an object. The more attributes returned, the more processing power to retrieve those attributes, the more data on the network, which means the longer it will take to retrieve the data you need.

In .Net if you do not specify the PropertiesToLoad property it returns the default set of attributes. When searching against my Windows 2003 Forest it retrieves 25 LDAP attributes. I have no idea where it gets that list from, or if it is modifiable.

To set your own list of PropertiesToLoad simply use the following code:

DirectorySearcher searcher = new DirectorySearcher();

searcher.PropertiesToLoad.Add("sAMAccountName");

Use Paged Searches

By default Active Directory will only return a maximum of 1000 results. Only being able to search for up to 1000 objects would make the service useless as a directory server. LDAP servers provide a method called paging to return larger result sets by allowing clients to request the next page of objects. This prevents excessive memory use by the server (and the client). By paging the results it only has to store up to 1000 objects at a time in memory.

To enable paging in .Net all we need to do is set the property PageSize to a value larger than 0. Ideally you would set the PageSize to be the same as the server’s MaxPageSize value to reduce the number of pages required to iterate through the entire set. If you set the PageSize to a value larger than the MaxPageSize the server will still return the number of objects specified by MaxPageSize.

DirectorySearcher searcher = new DirectorySearcher();

searcher.PageSize = 1000;

After PageSize is set paging happens seamlessly. If you foreach through the SearchResultCollection it seamlessly goes from object 1000 to object 1001.

SearchResultCollection results = searcher.FindAll();
foreach (SearchResult result in results)
{
   Console.WriteLine(result.Path);
}

If you step through your code you will see network activity and a visible delay when you try to read object 1001.

Advanced DirectorySearcher Settings

Here are some more advanced and less prominent settings you can change on your DirectorySearcher objects.

The property PropertyNamesOnly causes the directory to only return the names of properties and does not retrieve their values. This is useful if you only want to see if an object has a value for a certain attribute, or if you’re going to convert to a DirectoryEntry object anyway. Converting to a DirectoryEntry object causes a new call to LDAP which will then create an object cache and populate the property values anyway.

The property ServerPageTimeLimit sets how long the server should search for objects before returning a page of results. This is useful if you are doing a complicated query and want to retrieve results after so many seconds, even if a full page of results has not been retrieved yet. With this setting enabled you will receive a page of results either when the PageSize limit is reached or when the specified amount of time has elapsed.
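Here is a minimal sketch of setting both properties on a searcher.  The filter and the five second limit are arbitrary choices for illustration, not recommendations.

```csharp
using System;
using System.DirectoryServices;

public static class AdvancedSearcherSettings
{
    public static DirectorySearcher Create()
    {
        var searcher = new DirectorySearcher();

        // Assumed filter: users that have any telephoneNumber value.
        searcher.Filter = "(&(objectCategory=user)(telephoneNumber=*))";
        searcher.PropertiesToLoad.Add("telephoneNumber");

        // Return attribute names only, without their values.
        searcher.PropertyNamesOnly = true;

        // Page the results, and ask the server to send back whatever it
        // has found after five seconds even if the page is not yet full.
        searcher.PageSize = 1000;
        searcher.ServerPageTimeLimit = TimeSpan.FromSeconds(5);

        return searcher;
    }
}
```

Nothing is sent to the directory until you call FindOne or FindAll on the returned searcher.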