ADO.NET DataAdapter (part 2)

This article was originally published in VSJ, which is now part of Developer Fusion.
In last month's VSJ I discussed how to setup the DataAdapter using a variety of techniques to construct and populate one or more DataTable objects associated with a chosen DataSet. Once the data has been loaded into client memory, the next tasks are to display the data to the user (if necessary), accept any changes to the data, record them in the in-memory DataTable, and eventually, post these changes to the source database. This article outlines each of these steps and focuses special attention on updating data on the server.

As I said in Part 1, the role of ADO.NET in regard to managing updates has been radically reduced when compared to COM-based ADO “classic” (ADOc). With ADO.NET you are saddled with the responsibility of creating functional, well-performing and syntactically correct SQL UPDATE, DELETE and INSERT queries to post changes to the database. Yes, you can use the DACW or the CommandBuilder to build these commands, which are used when the DataAdapter Update method is executed. However, you'll find it's expensive to use the CommandBuilder (it takes an extra round trip and considerable processing), and the efficiency of the SQL is very limited. The CommandBuilder also does not handle complex SELECT queries and no attempt is made to use TimeStamp or other more sophisticated concurrency checking mechanisms such as TimeStamp columns. Most developers quickly learn to bypass the CommandBuilder and write their own action commands.

Update strategies

Updating data is a fairly complex subject as there are as many types of databases as there are database designers – or so it seems. Some developers and DBAs are comfortable with direct table access, but most are not. This means that the DBA usually does not grant developers the right to access base tables – just Views, or (more typically) stored procedures. All too often a single-table update won't work anyway. For example, when one adds a new order item, the item (and all other items in the order) often need to be added as a transaction along with the order itself. In my experience, this is usually accomplished using a business object or a stored procedure. As we evolve to the Yukon version of SQL Server, the distinction between stored procedures and business objects will diminish; but I digress. For purposes of this article, we'll discuss the two techniques of using direct table updates and using stored procedures to change the data separately.

Changing data in the DataTable

Before we start talking about how the DataAdapter Update method executes the “right” action query, we need to discuss how to change the data in your DataTable. That might seem pretty trivial, but if you're working with an ASP.NET application, this is not done for you – even if your DataTable (or DataView) is bound to one or more controls. Let's talk about this scenario first, and then we'll visit the Windows Forms data binding issues.

ASP.NET data binding

When your user changes data in a bound control, the values are returned to your application in the ViewState and the server-side object properties are repopulated with these changes. However, since the server does not automatically persist the DataTable (or DataSet), it's up to you to reconstruct the DataTable or (more typically) extract it from the Session state. Once you reconstitute the DataTable, it's your responsibility to change the DataRow Item Value properties to their current state based on the Value property of the bound control(s) or the cell values of a data grid. As an example, the block of code shown below is used in my cookbook recipe web site to post changes to a selected DataRow just before I post the changes to the database. Note that the DataSet is reconstituted from the Session state, the row being edited is addressed and each Item (column) in the row has its Value property set with the Value of a bound TextBox. I then execute an Update on the DataSet which executes the appropriate action query based on the changes made.
Private Sub btnSave_Click(ByVal sender As System.Object, _
									ByVal e As System.EventArgs) Handles btnSave.Click
	dsRecipeEdit1 = CType(Session(“dsRecipeEdit”), DataSet)
	drRecipe = dsRecipeEdit1.Tables(0).Rows(0)
	With drRecipe
		.Item(“Title”) = txtTitle.Text
		.Item(“Category”) = cbCategory.SelectedItem
		.Item(“SubmittedBy”) = txtSubmittedBy.Text
		.Item(“Source”) = txtSource.Text
		.Item(“Ingredients”) = txtIngredients.Text
		.Item(“Directions”) = txtDirections.Text
		.Item(“ServingSuggestions”) = txtServingSuggestions.Text
		.Item(“Serves”) = txtServes.Text
		.Item(“TimeToPrepare”) = txtTimeToPrepare.Text
		.Item(“PhotoPath”) = imgPhoto.ImageUrl
		.Item(“ChangedDate”) = Now
	End With
	daRecipeList.Update(dsRecipeEdit1)

Windows forms data binding

While Windows forms (WinForm) data binding is “full-duplex”, there are still a few issues that might make it tough to understand why the Update method does not seem to do anything when you expect that it should. If you want the Update method to post changes to the database, ADO.NET has to know that the DataSet in question has changes to post. No changes means no round trip to the server and no evidence that the Update method did anything. This means it's important to get ADO.NET to correctly set the HasChanges property for you – it gets set to True when any edit made to a DataRow has ended. ADO.NET and Windows can't figure this out for you in some cases, so you'll usually need a way to commit the row edit on your own.

ScreenshotConsider that in WinForm applications, bound controls hold the data value and its changes until ADO.NET thinks that the data should be posted to the underlying DataRow. For example, if you create a form with one or more bound controls, as shown on the right, you'll find that the critical DataSet.HasChanges property does not change state when the contents of the TextBox controls change unless you move focus to another control before clicking Update. If it did, HasChanges would change state when you simply navigated to another row. However, it does change automatically when you change the CurrencyManager.Position property – assuming you have changed a bound control. However, this would mean that your user would have to remember to either change focus (click on another control) or change the currency pointer to get the binding manager to bind to another row. In the example, I use a TrackBar control to set the Position property so users can navigate through the bound rowset.

Let's take a look at the code used to make all of this work and examine how to commit (end) the edit and tell ADO.NET to post the changes to the DataRow.

The code shown below establishes a SqlClient connection, runs a query and constructs a DataTable from the rowset.

Dim cn As SqlConnection
Dim da As SqlDataAdapter
Dim ds As New DataSet
Dim dt As New DataTable
Dim cb As SqlCommandBuilder
Dim cm As CurrencyManager
Private Sub Form1_Load(ByVal sender As System.Object, _
									ByVal e As System.EventArgs) Handles MyBase.Load
	Try          ' Trap problems during the open
		cn = New SqlConnection(“server=demoserver;database=biblio;...”)
		da = New SqlDataAdapter(“Select Au_ID, Author, Year_Born “ _
				& “FROM authors WHERE Year_Born < 1900”, cn)
		cb = New SqlCommandBuilder(da)
		da.Fill(ds)
		dt = ds.Tables(0)
		' Add data bindings manually
		txtAu_ID.DataBindings.Add(New Binding(“Text”, dt, “AU_ID”))
		txtAuthor.DataBindings.Add(New Binding(“Text”, dt, “Author”))
		txtYearBorn.DataBindings.Add(New Binding(“Text”, dt, “Year_Born”))
		' Specify the CurrencyManager for the DataTable
		cm = CType(Me.BindingContext(dt), CurrencyManager)
		' Set the initial Position of the control
		cm.Position = 0
		tb1.Maximum = dt.Rows.Count
		' dt.Rows(cm.Position).BeginEdit()      ' Not needed
		Timer1.Enabled = True
		Catch ex As Exception
		MsgBox(ex.ToString)
	End Try
End Sub
Yes, I use the CommandBuilder to construct the action queries – so sue me. The code then binds three TextBox controls to the columns returned by the query and saved in the DataTable. I setup a CurrencyManager to help position the “cursor” at the currently bound row and set the Position to 0 – the first row. I also set the TrackBar control's Maximum property to reflect the number of rows in the Rows collection. The timer is used to poll the DataSet.HasChanges property to see when it changes state from False (when the DataTable has not changed) to True (when a change has been posted to the DataTable).

The listing below contains the next block of code which manages the changes made so they are correctly posted to the DataTable when things change in the TextBox controls.

Private Sub tb1_Scroll(ByVal sender As System.Object, _
									ByVal e As System.EventArgs) Handles tb1.Scroll
	cm.Position = tb1.Value
End Sub

Private Sub btnUpdate_GotFocus(ByVal sender As Object, _
						ByVal e As System.EventArgs) Handles btnUpdate.GotFocus, _
						btnUpdate.MouseHover
	' To deal with the situation where the TextBox is changed and the
	' user does not move to another row, cell or control before clicking
	' the Update button
	dt.Rows(cm.Position).EndEdit()
End Sub
As you can see, there are a couple of events where I placed the strategic line of code to call EndEdit on the current DataRow – I trapped MouseHover and GotFocus events on the Update command button. This means if the user even thinks about doing an update, the EndEdit method is called. This makes sure that when the Update method is actually called, ADO.NET senses the fact that there are changes to post to the server (if any rows have actually changed).

Another approach might be to add an EndEdit to the TextBox Changed event. This makes sense in a way, but if you do this, simply navigating to a new row makes ADO.NET think that the row has changed when it has not.

Direct table updates

The DACW and CommandBuilder were designed to make this type of update pretty simple – at least when it comes to generation of the action queries and the data mapping. By using the SelectCommand to query the server at design time (the DACW) or runtime (the CommandBuilder), ADO.NET automatically generates the required UpdateCommand, InsertCommand and DeleteCommand. Of course, this also assumes that the SelectCommand is not too complex. In this case you're going to have to generate these action queries yourself one way or another.

I devote an entire chapter in my book ADO.NET and ADO Examples and Best Practices for Visual Basic Programmers – 2nd Edition to discussing how to construct the SQL for update statements. In a nutshell, there are several issues to be addressed – even when working with direct table updates:

  • Concurrency: When you're working in a multi-user environment, you need to be concerned about changes to rows that other users might also be changing. In my experience, it's usually easier to design-in concurrency than program around it. If traffic engineers thought about intersections the way that programmers think about data collisions, they would spend all of their time worrying about where to place the ambulances instead of how to time the traffic lights to avoid collisions in the first place. When working in a disconnected environment, pessimistic locks don't make much sense and optimistic concurrency is all you can depend on. This means you need to place a lot more emphasis on designs that eliminate the chance of collisions whenever possible.
  • Identity: As new rows are added to a database table, the SQL engine sets the Identity column in the new row (assuming there is one). The challenges here involve managing client-side identity values and refreshing the server-generated value post INSERT. One widely accepted technique is to set the AutoIncrement and AutoIncrementSeed properties to -1. This creates a set of client-side values that can be used to relate parents and children. Since these values are not passed to the server when you insert new rows, you need to retrieve the server-created values after the INSERT. This functionality is built into ADO.NET as it can capture either a separate rowset or output parameter containing the new identity value.
  • Timestamps: ADO.NET has no built-in mechanism to deal with Timestamp-managed concurrency. Since most sophisticated (and some pretty simple) DBMS systems use timestamps to simplify concurrency checking, it's important to be able to fold in this technology into your design. In this case you're going to have to write the UpdateCommand SQL yourself to compare the current (server-side) timestamp value with the current row's timestamp value. If the values are different, the UPDATE should be rolled back – there has been a collision. Once a row has been updated (or a new row has been inserted) you need to fetch the newly generated timestamp value, just as you fetched the new Identity value post INSERT.
  • Permissions: As your DBA begins to manage the security of your DBMS more closely, you'll need to find ways to work with his or her security constraints. These often include restrictions on the DBMS objects you're going to be able to change. In some cases this means you won't be able to access the tables at all – you'll have to use Views or stored procedures instead. In other cases you might have update access to a database table, but to only certain columns. This means the Visual Studio .NET wizards and drag-and-drop tools will generate code that will have to be tuned to eliminate access to restricted columns or data tables.
  • Relationships: After all, most of us are working with relational databases. This means that our updates often (usually) involve more than one table and these changes cannot be permitted to proceed as individual updates, but as transaction-managed updates that rollback if a dependant change fails. ADO.NET can handle many of these issues for you – at least to some extent. Remember that when changing hierarchies, you'll need to add parent rows first, add children last and delete children before you delete parents. By judicious use of the DataRowVersion states you can filter just those rows to change and use two update phases to make hierarchical changes.
  • Data Integrity: Just because your client code accepts the data, it does not mean your DBA will. A good database administrator will add code to test each column and row using defaults, rules and possibly stored procedure logic. These tests ensure the changes your code (or anyone's code) makes are correct and conform to accepted data integrity standards. I drone on and on about this in my books and suggest several strategies to ensure that the data is captured correctly in the first place. One suggestion involves use of Extended Properties to store mask, minimum, maximum, suggested, default and other properties to help manage the data entry process.
  • Business rules: The logic that defines your business is as important as anything in your application. More and more we're seeing this logic move out of the client and into the middle or data tiers. Your client needs to be at least aware of this logic so that it can better capture the data and submit changes that conform to these evolving rules and criteria.
As you can see handling data updates using direct table access can also be a considerably involved task. Let's talk briefly about stored-procedure-based updates and contrast how this approach moves much of this logic server-side where it can be more easily managed by the DBA team.

Stored procedure-based Updates

Database systems that depend on stored procedures to perform updates are becoming more and more common as they yield better code and developer performance. Because much of the logic described above is handled in stored procedures, DBAs can more closely manage and protect their data and ensure that all applications conform to strict business, data, and referential integrity guidelines. As the new version of SQL Server approaches we'll see even more emphasis on server-side logic (such as stored procedures) to handle these issues – but they will be written in any .NET CLR language.

While the DACW can also setup calls to stored procedures to perform updates, ADO.NET expects these stored procedures to be mirror-images of the commands that update, insert or delete a single row in a single table. This means that you're going to have to take over much of the coding yourself to call more sophisticated stored procedures to make changes to the DBMS.

Calling stored procedures yourself can be somewhat challenging as you have to code the Parameters collection. Depending on the .NET Data Provider you're using this might mean defining an individual Parameter object for each parameter defined in the stored procedure or simply those not defined with a default setting in the stored procedure definition. My book discusses several techniques that leverage the copious overloads on the Parameter.Add method that make this process fairly painless. The trick here is to pass in the “right” version of the DataRow data. That is, you'll want to pass the “original” version of the Timestamp to compare with the server-side timestamp. The listing below illustrates how to extract the DataRowVersion.Original from the DataRow being updated to compare to the server-side row.

'
' SqlUpdateCommand1
'
da.UpdateCommand = New SqlCommand(Nothing, cn)
										' Instantiate new DA UpdateCommand
With da.UpdateCommand

	.CommandText = “UPDATE DemoTSConcurrency SET Name = @Name, “ _
	& “State = @State “ _
	& “ WHERE (PID = @PID) AND (TimeStamp = @Original_TimeStamp); “ _
	& “ SELECT PID, Name, State, TimeStamp “ _
	& “ FROM DemoTSConcurrency WHERE (PID = @PID)”

	.Parameters.Add(New System.Data.SqlClient.SqlParameter(“@PID”, _
			System.Data.SqlDbType.Int, 4, “PID”))
	.Parameters.Add(New System.Data.SqlClient.SqlParameter(“@Name”, _
			System.Data.SqlDbType.VarChar, 50, “Name”))
	.Parameters.Add(New System.Data.SqlClient.SqlParameter(“@State”, _
			System.Data.SqlDbType.Char, 2, “State”))
	.Parameters.Add(New System.Data.SqlClient.SqlParameter( _
			“@Original_TimeStamp”, System.Data.SqlDbType.VarBinary, 8, _
			System.Data.ParameterDirection.Input, False, CType(0, Byte), _
			CType(0, Byte), “TimeStamp”, System.Data.DataRowVersion.Original, _
			Nothing))

End With
In this case (to make the example simpler) I hard-coded the UPDATE statement, but this would generally be executed as a stored procedure. Note the second query to return the post UPDATE TimeStamp and other column data. This is necessary in cases where you might not update or own all of the columns.

Stored procedures can also return multiple resultsets. If invoked using the DataAdapter Fill method, each returned rowset generates its own DataTable. No, you're not going to be happy executing stored procedures using the Update method unless they return “1” as the rows affected.

Generally, I write my own logic when updating data using stored procedures. I execute the stored procedure using the ExecuteNonQuery method against the Command object so I can handle multiple resultsets and other output parameters on my own. I expect future versions of Visual Studio .NET will handle this with more aplomb.

These two articles have discussed tips and techniques you're likely to face when working with ADO.NET to query and update databases using the DataAdapter. I hope it has made the transition to Visual Studio .NET easier.


William Vaughn is a well-known author who has worked with mainframe, minicomputer and personal computer systems as a developer, manager, trainer, marketer, support person and writer for over 30 years. His area of specialisation is focused on data access, and especially Visual Basic and SQL Server. Bill frequents conferences all over the world including VSLive, Developer Connections PASS and DevWeek and has written articles for magazines such as SQL Server Magazine and MSDN.

You might also like...

Comments

Contribute

Why not write for us? Or you could submit an event or a user group in your area. Alternatively just tell us what you think!

Our tools

We've got automatic conversion tools to convert C# to VB.NET, VB.NET to C#. Also you can compress javascript and compress css and generate sql connection strings.

“In theory, theory and practice are the same. In practice, they're not.”