Batch Apex - a powerful new functionality in Summer '09
by Nick Simha on May 15, 2009 at 03:09 PM
Among the many new goodies in the Summer '09 release is a powerful new feature for batch processing of your database records. Tasks that process large data volumes without any active human intervention can take advantage of this feature. As an example, consider the task of validating addresses in your contacts when you potentially have millions of contact records. A batch job is ideal for this scenario, since you can start the job and then continue to work, or even log off, while it continues to execute.
To use this functionality, you need to implement the Database.Batchable interface. You can find a usage example in the Apex Code Developer's Guide. The Database.Batchable interface has three methods that you need to implement, as shown below:
global class MyBatchTest implements Database.Batchable {
    global Database.QueryLocator start() { ... }
    global void executeBatch(SObject[] batch) { ... }
    global void finish() { ... }
}
The start() method determines the set of records that will be processed by the executeBatch method. You would need to construct a SOQL query and return a QueryLocator object. For example,
return Database.getQueryLocator('SELECT Name, MailingAddress FROM Contact');
would return all contact records for processing. You can, of course, make the query as selective as you wish with additional filter criteria. The QueryLocator object can return at most fifty million records. To start a batch job, you create an instance of this class and pass it to the Database.executeBatch method.
MyBatchTest b = new MyBatchTest( ... ) ;
ID myBatchJobID = Database.executeBatch(b) ;
When you call executeBatch on your instance, the system enqueues the job for processing and returns an ID. When the system is ready to execute the job, it calls the start method and then calls the executeBatch method once for each chunk of 200 records. So if the QueryLocator returned 1,000 records, the executeBatch method will be called five times. The batch job runs with the permissions of the user who enqueued the job. The finish method is called after all records have been processed and can be used for post-processing tasks such as sending out e-mails.
The ID returned by the Database.executeBatch method can be used to monitor the status of the job programmatically by querying the AsyncApexJob object. You can also monitor the job under Setup->Monitoring->Apex Jobs. The documentation has additional details on usage, governor limits and a few best practices.
A common question that comes up is whether jobs can be scheduled at a certain time or with some periodicity (for example, running a job every day at midnight). This feature is not (yet) available. Also, the batch Apex feature is still in preview mode and has to be explicitly provisioned for your org. If you need this feature, please contact support with a short description of your use case. Finally, I would encourage you to sign up for the Summer '09 preview; it has a lot of other cool new features!
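The chunking arithmetic and the programmatic monitoring described above can be sketched roughly as follows. This is a minimal illustration, not code from the documentation: the class name BatchJobMonitor and both of its methods are hypothetical helpers, and the AsyncApexJob field names (Status, JobItemsProcessed, TotalJobItems, NumberOfErrors) are the ones I know from the generally available object, so the preview release may differ.

```apex
// Hypothetical helper class; the name and methods are illustrative only.
public class BatchJobMonitor {

    // Number of times the batch method would be invoked for a given
    // record count and chunk size (ceiling division on Integers).
    public static Integer expectedBatches(Integer recordCount, Integer batchSize) {
        return (recordCount + batchSize - 1) / batchSize;
    }

    // Looks up the job by the ID that Database.executeBatch returned and
    // builds a one-line progress summary. Field names are assumed to
    // match the generally available AsyncApexJob object.
    public static String describeJob(Id jobId) {
        AsyncApexJob job = [
            SELECT Id, Status, JobItemsProcessed, TotalJobItems, NumberOfErrors
            FROM AsyncApexJob
            WHERE Id = :jobId
        ];
        return job.Status + ': ' + job.JobItemsProcessed + ' of '
            + job.TotalJobItems + ' batches processed, '
            + job.NumberOfErrors + ' errors';
    }
}
```

With the 1,000-record example from above, BatchJobMonitor.expectedBatches(1000, 200) evaluates to 5, and BatchJobMonitor.describeJob(myBatchJobID) could be called at any point after the job has been enqueued.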
Comments
Posted by Jason on May 15, 2009 04:39 PM:
The lack of scheduling was a bit of a letdown for this feature, as I would assume most use cases for this involve scrubbing/recalculating data where scheduling would be ideal. It is still a cool feature and you may be able to get around this limitation with a trigger cron job.
Any chance we could get the demo code from the Summer09 webinar that showed the counting of states and filling in the map? I'm curious how this count is maintained since, according to the docs, "You cannot use it to pass information between instances of the class during execution of the batch job".
-Jason
Posted by Nick Simha on May 16, 2009 08:53 AM:
Jason,
Scheduled Apex is on the roadmap and the PMs are well aware of the importance. One workaround (hack?) is to call your batch job in an InboundEmailHandler that can be triggered by a timed workflow task.
Each batch is executed independently, so you can't use instance variables to share state information. Instead, you can use a database record to keep track of it. The record can be keyed off some initial member variable value. I will find out when the demo code can be posted and update it here.
Posted by Jason on May 18, 2009 08:40 AM:
Cool. I figured there was some crafty workaround to perform scheduling.
Looking forward to the demo code.
Posted by Jason on May 20, 2009 01:58 PM:
Me again, :-P. I see the entry has been tweaked to say that the query locator can return 50 million records instead of 5 million.
The apex reference guide still says 5. Can you confirm this is 50 million records?
5 million would have been plenty for us but, 50 million, that is pretty sweet.
Posted by Ken Koellner on May 21, 2009 01:02 PM:
Two comments-
It would be nice to determine the amount of work via data in a list. Would adding a where clause to the query with an "in :myApexList", where myApexList is a List variable, work?
It is going to call your execute method with 200 rows at a time. Would that then not be under the standard Apex limits, such as 100 DML operations? If you are still under those limits you might not get the work you need done done.
Posted by Nick Simha on May 21, 2009 01:09 PM:
Jason - yeah - 50M is sweet. I got this number from the PM and I have pinged the doc team for confirmation.
Posted by Jon on May 26, 2009 03:32 PM:
Is it possible to do Apex Callouts with Batch Apex? If not, is it on the roadmap?
Thanks,
Jon
Posted by Nick Simha on May 26, 2009 06:48 PM:
Jon,
Callouts are not allowed at this time. I will check with the PM on the roadmap but I would encourage you to create an idea on the idea exchange http://ideas.salesforce.com/
Nick
Posted by Girish on June 8, 2009 08:55 AM:
Hello Nick,
I have a question on the batch implementation. I understand that batch operations can be performed on a set of Salesforce records returned by the start method.
However, can we do other batch operations, like parsing a CSV file and inserting a bunch of records into Salesforce? Is that possible at all with the new batch enhancements provided by Salesforce? It would be great if you could let me know.
Posted by Jesse on June 8, 2009 03:32 PM:
I have the same concerns as Girish. I'm looking to the batch implementation so that we can insert our daily leads, which are emailed in a CSV file.
The comments to this entry are closed.