Friday, August 7, 2009

Retrieving Database Schema Information

ADO.NET allows you to retrieve a great deal of schema information from a database without resorting to writing your own queries to access system tables. There will be times you still need to write your own queries but the majority of commonly used schema information can be retrieved using the GetSchema method of your connection object.

Exactly what's available and the details of how to retrieve it may vary from database to database but the code pattern remains the same. Note that GetSchema is a member of the DbConnection class, thus it is inherited and overridden in each class that is derived from DbConnection, e.g. SqlConnection and OleDbConnection.

So, the first step to using the GetSchema method is to determine what information your database and ADO.NET provider expose. To do that you call the overload of GetSchema that takes no arguments. I’ll use the SqlClient provider for these examples but, as I said, the pattern is the same no matter the provider and no matter the database. Start by creating a new Windows Forms Application project and adding a DataGridView to your form. In the form’s Load event handler, add the following code:

C#

using (SqlConnection connection = new SqlConnection(connectionString))
{
    connection.Open();
    this.dataGridView1.DataSource = connection.GetSchema();
}

VB

Using connection As New SqlConnection(connectionString)
    connection.Open()
    Me.DataGridView1.DataSource = connection.GetSchema()
End Using

Note that you’ll have to provide your own connection string as appropriate for your database. When you run that project you should see your grid populated with information about the schema collections available. Let’s look at each column of that grid.

The first column is named “CollectionName” and it contains values such as “Tables”, “Views” and “Procedures”. Not surprisingly, those collections provide information about the tables, views and procedures in the database. When calling one of the overloads of GetSchema with parameters, the first argument must be one of the collection names from that column.

The second column is named “NumberOfRestrictions”. A restriction is basically a filter for a collection that you can specify to reduce the amount of information returned. For instance, many collections can be filtered by name of the schema they belong to, e.g. “dbo”. Note that one of the collections you can query is named “Restrictions” and it will give you information about the restrictions available for each collection. When calling the overload of GetSchema with two parameters, the second parameter must be an array of restriction values. We’ll look at restrictions more closely later.

The third column is named “NumberOfIdentifierParts”. It contains the number of columns required to uniquely identify a record in each collection.

Let’s look at getting information from a specific collection. One of the most commonly queried collections is Tables, so let’s take a look at that. Viewing the data in our grid after executing the previous code, we can see that the Tables collection has 4 restrictions and 3 identifier parts. Store that information away for future use. For now, edit the code in your form to get the schema for the Tables collection specifically:

C#

using (SqlConnection connection = new SqlConnection(connectionString))
{
    connection.Open();
    this.dataGridView1.DataSource = connection.GetSchema("Tables");
}

VB

Using connection As New SqlConnection(connectionString)
    connection.Open()
    Me.DataGridView1.DataSource = connection.GetSchema("Tables")
End Using

Note that, as mentioned earlier, the first argument to GetSchema is the name of the collection you want to query. Executing that code populates our grid with information about all the tables in the database we’re connected to. The columns are named “TABLE_CATALOG”, “TABLE_SCHEMA”, “TABLE_NAME” and “TABLE_TYPE” and they contain the name of the database containing the table, the name of the schema that owns the table, the name of the table and the type of the table respectively.

Looking at the data, consider what would be required to uniquely identify a record. Recall that the NumberOfIdentifierParts for the Tables collection was 3. Those three identifier parts are the TABLE_CATALOG, TABLE_SCHEMA and TABLE_NAME, all required to uniquely identify a table within a server instance.

Now let’s look at the Restrictions collection, the contents of which are important when querying many other collections. Edit your code once again to query the Restrictions collection:

C#

using (SqlConnection connection = new SqlConnection(connectionString))
{
    connection.Open();
    this.dataGridView1.DataSource = connection.GetSchema("Restrictions");
}

VB

Using connection As New SqlConnection(connectionString)
    connection.Open()
    Me.DataGridView1.DataSource = connection.GetSchema("Restrictions")
End Using

Having executed that code, we see our grid populated with information about all the restrictions that can be applied to schema queries. The first column is named “CollectionName” and contains the name of the collection that the restriction applies to. Note that there are four rows containing the collection name “Tables” and recall that the NumberOfRestrictions for the Tables collection was 4.

The restrictions for the Tables collection are named “Catalog”, “Owner”, “Table” and “TableType” with RestrictionOrder values of 1, 2, 3 and 4 respectively. Those restrictions relate to the TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME and TABLE_TYPE columns of the Tables collection respectively.

Let’s say that we want to filter the Tables collection by database. When viewing the contents of the Restrictions collection you should have noticed that the restriction named “Catalog” had a RestrictionOrder of 1, meaning that that must be the first filter value we provide. Try changing the previous code to specify the Tables collection and an array of strings containing the name of the database you’re connected to as the value for the first restriction. For instance, if you don’t specify an Initial Catalog in your SqlClient connection string then you’ll be connected to the “master” database.

C#

using (SqlConnection connection = new SqlConnection(connectionString))
{
    connection.Open();
    this.dataGridView1.DataSource = connection.GetSchema("Tables",
                                                         new string[]
                                                             {
                                                                 "master",
                                                                 null,
                                                                 null,
                                                                 null
                                                             });
}

VB

Using connection As New SqlConnection(connectionString)
    connection.Open()
    Me.DataGridView1.DataSource = connection.GetSchema("Tables", _
                                                       New String() {"master", _
                                                                     Nothing, _
                                                                     Nothing, _
                                                                     Nothing})
End Using

Note that a null reference is passed for each restriction that is not to be used. Restriction values must be provided in the order specified by the Restrictions collection. As such, a value must be specified for each restriction up to and including the last one you want to use. Any after that can be omitted if desired though. That means that the above code could be rewritten as:

C#

using (SqlConnection connection = new SqlConnection(connectionString))
{
    connection.Open();
    this.dataGridView1.DataSource = connection.GetSchema("Tables",
                                                         new string[] {"master"});
}

VB

Using connection As New SqlConnection(connectionString)
    connection.Open()
    Me.DataGridView1.DataSource = connection.GetSchema("Tables", _
                                                       New String() {"master"})
End Using

Now, let’s say that we wanted to filter the Tables collection by TABLE_NAME, perhaps to check whether a specific table exists. The Table restriction has a RestrictionOrder of 3 so, to filter by TABLE_NAME, we must provide at least three restriction values. If we don’t want to filter by TABLE_CATALOG and TABLE_SCHEMA then we simply provide null references for those restrictions.

C#

using (SqlConnection connection = new SqlConnection(connectionString))
{
    connection.Open();
    this.dataGridView1.DataSource = connection.GetSchema("Tables",
                                                         new string[]
                                                             {
                                                                 null,
                                                                 null,
                                                                 "SomeTable"
                                                             });
}

VB

Using connection As New SqlConnection(connectionString)
    connection.Open()
    Me.DataGridView1.DataSource = connection.GetSchema("Tables", _
                                                       New String() {Nothing, _
                                                                     Nothing, _
                                                                     "SomeTable"})
End Using

Run that code and the grid will contain only information about the table with the specified name.

Because the GetSchema method returns a DataTable, as I’ve shown, the result can be bound very simply, just like any other DataTable. An example of how it might be commonly used is to display a list of tables in a ComboBox. You could query the Tables collection using the GetSchema method and then display the TABLE_NAME values in the drop-down list.

C#

using (SqlConnection connection = new SqlConnection(connectionString))
{
    connection.Open();
    this.comboBox1.ValueMember = "TABLE_NAME";
    this.comboBox1.DataSource = connection.GetSchema("Tables");
}

VB

Using connection As New SqlConnection(connectionString)
    connection.Open()
    Me.ComboBox1.ValueMember = "TABLE_NAME"
    Me.ComboBox1.DataSource = connection.GetSchema("Tables")
End Using

I’m sure that, given time, you’ll be able to find other uses for the GetSchema method too.

2 comments:

shirisha said...

how do we specify multiple restrictions. for example if i want to get for multiple tables?? how can we specify multiple table name?

jmcilhinney said...

@shirisha, this may be a bit late for you so sorry about that, but I'll answer the question in case others need it. If you want information for multiple tables then you must either call GetSchema multiple times or else don't specify a table name and then filter locally using a DataView.