Introduction
Large organizations store their data in several different databases, and more often they resort to storing on heterogeneous data servers by choice, as it suits their business practice. However they also have a need to generate reports based on a combination of data coming from these varied sources. Sometimes the data of the enterprise is stored in a fragmented fashion at different locations on the same or different types of data base servers, by vertical or horizontal partitioning, and here again there arises a need to aggregate data from more than one source for generating a report. Sometimes security concerns dictate storing sensitive information on more secure data servers, and the less important ones on servers which generally lack the security features of the higher end products. In these different scenarios resorting to distributed queries becomes a necessity.
Distributed queries query more than a single database in the same server, or query databases on multiple database servers. They are homogeneous distributed queries when the database servers are of the same type (all MS SQL or all Oracle, for example), and are heterogeneous when they access several different kinds of database servers (MS SQL Server, Oracle, IBM Informix). These queries could be Select, Insert, or Update, as well as delete queries. Microsoft SQL Server supports distributed queries with Ole DB as a conduit to external data, whereas you can use ODBC for external data access in MS Access through linked servers.
This tutorial describes an exercise in querying tables in heterogeneous database servers, a MS Access database and an SQL database.
The distributed data
One of MS Access’s strong points is its capability to import data originating from a large number of different kinds of data sources, text files, spread sheet files, and so on. It can establish a live link to the tables by creating linked servers. A step-by-step procedure for bringing in external data using the ODBC interface was described in great detail in a previous tutorial; here, only some key screen shots of this process will be shown.
It is assumed that MS Access is installed (this tutorial used the following version: MS Access 2002 (10.2627)-SP1 on Windows XP Professional platform) on the desktop. Also a SQL 2000 (Version 8.0) Database server XPHTEK is also installed on the same machine. It could be on the network as well, but in this case it is on the same computer. Since the data stays on the servers it is further assumed that the servers are up and running.
{mospagebreak title=Creating Linked Servers}
Linked SQL Server
Create a MS Access database herein called DistributedQ.mdb. This database does not have any objects of its own. To this database import data by means of linking to tables resident in the Northwind database on the SQL 2000 Server. Follow the procedure described in the previous tutorial. Although the previous tutorial considered bringing in data from an MSDE desktop SQL Server, the process is exactly the same.
You begin by going to File–>Get External Data –>Link Tables…. This opens up a dialog window with the title LINK and you can set up the type of item you want to import. Since MS Access uses ODBC to import data from a SQL Server, go to the bottom of the Files of Type drop-down list and click on ODBC Databases(). This will take you to Select Data Source Window (this same window can be accessed by going to Start –>Control Panel–>Administrative Tools–>Data Sources (ODBC)). You follow the procedure, providing all the requirements, and you will finally reach this screen which completes creation of the ODBC Microsoft SQL Server Setup. You may also test the connectivity by clicking the Test Data Source… button.
When you click OK to this screen, a LINK Table dialog shows up with all the tables in the Northwind database. You can pick and choose, as well as multi-select by holding the Control or shift key. When you click out of the LINK window the tables you have chosen should all appear in the Main window of the DistributedQry application, as shown in this picture.
These are just links to the tables in the Northwind database on the SQL Server as shown below. Any changes made to these tables will be reflected in the linked tables. Also when you try access the link tables, you may be required to provide credentials, both authentication and permissions.
You may review the data on the linked tables by just double-clicking them in the Main window of the DistributedQry.mdb as shown for the dbo_Order Details table. For querying you may assume that the tables are on the database file and use the UI in access to query as well as create other database objects such as forms, reports, and data access pages.
{mospagebreak title=Linked MS Access Server}
Again, you can bring in external data by the same process. However this time it is a lot easier since .mdb happens to be one of the built-in types of file, and you can directly browse and find the file as shown here.
When you click on the Link button after choosing the mdb file, in this case Northwind.mdb, you will get the following Link Tables dialog where, again, you can select a single, or multi-select several tables to be linked.
For this tutorial only the Order Details table was linked.
Now with these linked servers, the main window of the DistributedQry.mdb file appears as shown in this picture. The tables linked to the SQL 2000 Servers have the prefix dbo attached to the table name.
{mospagebreak title=Creating the Distributed Query}
This is no different from querying in the Access database normally. If you use the query wizard to create a query, you will see the familiar interface from where you first select the table name from the drop-down and choose the columns you want as shown. The drop-down shows all the linked tables as well as any queries present. In this screen there is already a query created, which can also be seen.
We will use the query designer, however, to create the query as shown in this picture and save it as DistQry. The Order Details belong to the Northwind.mdb database.
The output from this query is shown here.
SQL View of the Distributed Query
An example of the distributed query constructed using any of the methods is used to display its SQL, and the SQL statement appears as shown here. The ones shown in red are contributed by the MS Access database’s linked table and the rest are provided by the SQL 2000 Server.
SELECT dbo_Orders.OrderDate, dbo_Orders.ShipName, [Order Details].Quantity, dbo_Products.ProductName FROM dbo_Products INNER JOIN (dbo_Orders INNER JOIN [Order Details] ON dbo_Orders.OrderID = [Order Details].OrderID) ON dbo_Products.ProductID = [Order Details].ProductID;
Linked Table Manager
This is a nice utility in MS Access shown here which can not only show the linked tables and their sources, but can also be used to update the links in the event the files get moved. You may have to browse to discover the location to which it was moved.
Summary
Microsoft Access with its support for linking to varied databases is very convenient for testing distributed queries. The example chosen was a simple one, but under realistic conditions when there are relational integrity conditions to be satisfied, especially in data modifying queries, great care is necessary. Although distributed queries are good for limited rows returned by the query, if the data transported is large then there could be resource related problems and network bottlenecks. A large number of important issues like security, transactions, and so on were not discussed, but they are outside the scope of this elementary tutorial.