Best Practices for LARGE SharePoint Lists and Documents Libraries

SharePoint lists and document libraries are not designed to handle large numbers of items, in fact as soon as you add more than a few thousand you will see a warning within the List Settings Page that says:

This list or library contains a large number of items. Learn about managing a large list or library and ensuring that items display quickly.

As part of my product suite I consistently have lists with hundreds of thousands of items in them and have learned many valuable lessons in making these kind of numbers workable.

Hopefully some of what i learned can help you.

1. First off do not use the DataView webpart (from the SharePoint designer) or any xml based rendering webpart as they first grab all the items in the list and then apply the XLST to render and you will face both memory and huge performance problems. The regular list webpart works fine even for large datasets, but you will not be able to get total list counts regardless.

2. If you are planning on sorting or filtering by any columns be sure to add them to the lists indexed fields in the list settings page.

3. SharePoint is a database driven application and the list data is thus stored in database tables. The way it stores them is very denormalized as each columns data is stored in a single row, which means if you have 2 list items with 3 columns you get at least 6 rows of actual sql data. This adds up very fast and will quicky cause both disk and index fragmentation. Always ensure that your logical disks are defragmented once a week and use the SQL Management studio to create an index maintenence plan for all your databases.

4. For Developers using the Object model:

a. Getting counts: never reference the list item collections count property as that will trigger a complete retrieval of all the items in the list and possibly throw memory errors. If you need the list count use list.count instead of list.items.count

b. Item Access: never access the list item collection by index position as it will re-retrieve each item when accessed.

for (int i=0;i< list.items.count;i++) // bad

SPListItem li = list.items[i]; //bad

instead use the standard enumeration model to walk though your collection when needed

foreach (SPListItem li in list.items) //good

d. Paging through large numbers of list items: use the SPQuery and the RowLimit property to page through your larger list collections

SPQuery query = new SPQuery()
query.RowLimit = 10000;
do
{

SPListItemCollection myItems = list.GetItems(query);

for (SPListItem li in myItems)
{
//do something here
}
query.ListItemCollectionPosition = myItems.ListItemCollectionPosition;

} while (query.ListItemCollectionPosition != null);

e. Deleting large numbers of list items (new): use SPWeb.ProcessBatchData whenever possible. Taking above advice here is my idea of the best model for item deletion.

SPList lst = SPContext.Current.Web["MYLIST"];
SPQuery query = new SPQuery();
query.ViewFields = "<FieldRef Name='ID'/>";
query.IncludePermissions = false;
query.RowLimit = 20000;

// first get all the id's to be deleted
ArrayList al = new ArrayList();
do
{
SPListItemCollection myItems = lst.GetItems(query);

foreach (SPListItem item in myItems) // note use enumeration
{
if (item.ID == skipId) continue;
al.Add(item.ID.ToString());

}
query.ListItemCollectionPosition = myItems.ListItemCollectionPosition;
} while (query.ListItemCollectionPosition != null);

StringBuilder sbDelete = new StringBuilder();
sbDelete.Append("<?xml version=\"1.0\" encoding=\"UTF-8\"?><Batch>");
string listguid = lst.ID.ToString();
int bcount = 0;

for (int i = 0; i < al.Count; i++)
{
if (bcount > 1000)
{
sbDelete.Append("</Batch>");
lst.ParentWeb.ProcessBatchData(sbDelete.ToString());
sbDelete = new StringBuilder();
sbDelete.Append("<?xml version=\"1.0\" encoding=\"UTF-8\"?><Batch>");
bcount = 0;

}
bcount++;
sbDelete.Append("<Method>");
sbDelete.Append("<SetList Scope=\"Request\">" + listguid + "</SetList>");
sbDelete.Append("<SetVar Name=\"ID\">" + al[i].ToString() + "</SetVar>");
sbDelete.Append("<SetVar Name=\"Cmd\">Delete</SetVar>");
sbDelete.Append("</Method>");
}

sbDelete.Append("</Batch>");
lst.ParentWeb.ProcessBatchData(sbDelete.ToString());

 

I hope this helps. If you have any further advice please comment on this post and i will incorporate it for everyone. THANKS!

Joel's list of more performance related information

 also see http://go.microsoft.com/fwlink/?LinkId=95450&clcid=0x409  for an earlier white paper on the subject

 

Del.icio.us | Digg It | Technorati | Blinklist | Furl | reddit | DotNetKicks
Published Wednesday, April 08, 2009 6:21 PM by notorioustech
Filed under: ,

Comments

# Links (4/9/2009) &laquo; Steve Pietrek - Everything SharePoint and Office

Pingback from  Links (4/9/2009) &laquo; Steve Pietrek - Everything SharePoint and Office

Thursday, May 21, 2009 6:19 PM by Twitted by ferringer

# Twitted by ferringer

Pingback from  Twitted by ferringer

Thursday, May 21, 2009 6:20 PM by Peter

# re: Best Practices for LARGE SharePoint Lists and Documents Libraries

Also:

* Don't have a view displaying all items, if at all possible.

* Specifically exclude these views from the search crawl.

* Expect the object model to intermittently choke/break when enumerating over these lists (yeah).

* Watch out for app pool OutOfMemory exceptions - we had to put in overlapped recycling and scheduled app pool restarts to accomodate. 32bit WFEs of course.

Thursday, May 21, 2009 7:26 PM by notorioustech

# re: Best Practices for LARGE SharePoint Lists and Documents Libraries

The OOM exceptions are avoidable if you page through the results in smaller chunks.

Wednesday, May 27, 2009 1:08 PM by Adam Bryer

# re: Best Practices for LARGE SharePoint Lists and Documents Libraries

Great recommendations. Wish I had read this before trying DVWP. Chokes on a list of about 2900 items.

Thursday, May 28, 2009 7:25 PM by cjacobs

# re: Best Practices for LARGE SharePoint Lists and Documents Libraries

Hello and thanks for this post (especially since the link in the actual SharePoint warning was essentially useless).

I have a very simple list that has one record (with 12 columns) submitted about once every hour, 365 days per year.  The form is just a conversion from an excel "form" that documents hourly job checks for an operations team.  My list was just published two months ago and I am already getting this warning.  

The list only has one view, it is for all items, grouped by create date, with an item limit (batched) of 100.  There isn't really any need to search the list, only occasionally go back a few days.  So on each screen we see about four days worth of data (and click next if we need to go back further). There is also no need for getting any counts.

Should I be concerned about this in terms of performance or hitting an item limit?  If so, is there any way to automate an archiving process for this list?

Thursday, May 28, 2009 7:54 PM by notorioustech

# re: Best Practices for LARGE SharePoint Lists and Documents Libraries

If you are just using the normal list view webparts then you will be fine.

Thursday, May 28, 2009 8:52 PM by cjacobs

# re: Best Practices for LARGE SharePoint Lists and Documents Libraries

Thanks for taking the time to answer my question above!

Monday, June 01, 2009 6:30 PM by Bob Mixon

# re: Best Practices for LARGE SharePoint Lists and Documents Libraries

Great article Chris.  I have used some of these same techniques with libraries containing hundreds-of-thousands of documents.  Glad you took the time to write about it!

Monday, June 01, 2009 6:39 PM by Bob Mixon

# Myths about 2000-item Limits in Lists and Libraries

I have had many discussions and written about this topic in the past and still receive questions about

# sharepoint developer resources &raquo; Blog Archive &raquo; Best Practices for LARGE SharePoint Lists and Documents Libraries

Pingback from  sharepoint developer resources  &raquo; Blog Archive   &raquo; Best Practices for LARGE SharePoint Lists and Documents Libraries

Tuesday, June 02, 2009 5:55 AM by will

# re: Best Practices for LARGE SharePoint Lists and Documents Libraries

If you have not already read the Microsoft white paper on large list do so at blogs.msdn.com/.../scaling-large-lists.aspx

Wednesday, June 17, 2009 9:10 PM by Shane Voda

# re: Best Practices for LARGE SharePoint Lists and Documents Libraries

Thank you for this read. It will be helpful, but I have a question with regards to the Paging attribute of the Common Data View Tasks. Even if you set this attribute to a small number, like 10, are you saying that the entire result set will still be returned AND the XSLT applied before it actually renders the current 10? Your reply will be most helpful as we are now just considering using it for lists that are sure to be tens of thousands deep.

Wednesday, June 17, 2009 9:56 PM by notorioustech

# re: Best Practices for LARGE SharePoint Lists and Documents Libraries

If you look at the underlying XSLT that get's generated you will find that it does load all the rows and then loops through till it gets to the current page x 10 position. you would be better off writing custom webparts that pass the sorting and filtering down to the list query, but paging will still be an issue if you are many pages in as you still have to loop till that point,BUT you won't have to go past it. Hope that isn't too confusion. Thanks for reading and participating!

Tuesday, August 18, 2009 5:08 AM by Twitted by kristiepg9ol

# Twitted by kristiepg9ol

Pingback from  Twitted by kristiepg9ol

Monday, November 02, 2009 10:13 AM by jpmv

# re: Best Practices for LARGE SharePoint Lists and Documents Libraries

Hi,

What is your advice regarding storagE limits?

if a list grows in size above 100GB, splitting the list through several site collections with one list would be a good choice? i'm asking because if splitting inside the same site collection (for example in several subsites) the content database would be the same...

best regards,

Monday, November 02, 2009 1:16 PM by notorioustech

# re: Best Practices for LARGE SharePoint Lists and Documents Libraries

It would be the same content database across site collections too if they are part of the same web application.

If your goal is to provide one "rollup" interface across all the split up lists, this is a bad idea. In order to do that kind of merging you need to load into memory all the items and then combine, facing the same issue that the XML/XSLT webparts face.

Alternatively, putting the items into Folders within the list would give you better control then have multiple lists within the same web application.

Tuesday, November 03, 2009 5:28 AM by jpmv

# re: Best Practices for LARGE SharePoint Lists and Documents Libraries

Hello,

Well but it's possible to create a site collection in a diferent content DB, so if i split a large list into smaller ones inside Site Collections (bellow the main site collection for the web application), and use managed path, i could split the documents in several different SQL Server DBs right? But this creates na overhead on user management and to be able to search for all documents... there is an alternative way? (Microsoft best pratices says that a content dbs should be between 50 ~ 100GB)

best regards

Tuesday, November 03, 2009 8:41 AM by notorioustech

# re: Best Practices for LARGE SharePoint Lists and Documents Libraries

If your primary focus is to reduce the single database size then yes, this could work. It is not standard model to split content database and is a more advanced trick.

There is another option for this too though.

There are a few companies with extensions to the blob store which will instead store the file attachments on the file system instead of the database, reducing size. I would look at doing this and optimizing the one database first before anything else.

Thursday, January 14, 2010 7:02 PM by ??ghyBlog » links for 2009-05-31

# ??ghyBlog &raquo; links for 2009-05-31

Pingback from  ??ghyBlog &raquo; links for 2009-05-31

Thursday, March 18, 2010 5:59 PM by Rudy

# re: Best Practices for LARGE SharePoint Lists and Documents Libraries

Thanks, I was running into performance problems and you remark about list.items.count helped me find the problem. I was using list.folders.count which has the same issue...

Monday, June 21, 2010 7:16 AM by Life in DotNet

# Best Practices for LARGE SharePoint Lists and Documents Libraries

Salve, se vi dovesse capitare di dover lavorare con delle liste sharepoint che contengono molti elementi

Tuesday, July 13, 2010 11:03 PM by nsp

# re: Best Practices for LARGE SharePoint Lists and Documents Libraries

The OOM exceptions are avoidable if you page through the results in smaller chunks.

how this can be done?

I have one list have 38K item, I suspected is this  cause the server out of memory. how do I identify the root cause ? any tools?

thanks  

Wednesday, July 14, 2010 12:09 AM by NSP

# re: Best Practices for LARGE SharePoint Lists and Documents Libraries

i suspected the large sharepoint list cause my sharepoint server faced out of memory problem, how do I confirm this is the root cause? any tool can trace it ?

Wednesday, July 14, 2010 7:29 AM by notorioustech

# re: Best Practices for LARGE SharePoint Lists and Documents Libraries

38k items isn't that much really, are you using any SharePoint designer generated webpart views? Then maybe.

If it is on a page that is hit simultaneously by multiple people then it can add up quickly.

Their isn't really an easy way to trace this as the root cause though, but you can strive to reproduce it. Watch the memory on server and hit the page using a testing tool or multiple people at the same time, measure the bump and multiply by the expected load.

Leave a Comment

(required) 
(required) 
(optional)
(required)