Choosing the Right Entity Framework Workflow
Many organizations use the .NET Entity Framework to create database applications because of the automation it provides and because developers can create a model that everyone can understand using tools provided as part of Visual Studio. However, some overlook the fact that the Entity Framework supports three workflows: code first, model first, and database first. Using the right workflow can save developers a lot of time and effort, especially when creating a complex database design.
Databases can become incredibly complex, to the point that no one really understands the entire structure of the database used for a mission-critical application. Managing these behemoths is difficult. When you’re part of the development team responsible for an application that relies on the database, the task seems downright impossible without a lot of help (and it’s still incredibly hard even then). That’s why Microsoft created the Entity Framework: to reduce complexity and make it possible for developers to write great applications without having to become database administrators (DBAs).
The Entity Framework is an object-relational mapper. That’s a fancy term, but what it means is that the Entity Framework takes the structure of the database and turns it into objects that the .NET Framework can understand. A developer uses those objects to interact with the database instead of interacting with the database directly. It’s possible to perform the full set of Create, Read, Update, and Delete (CRUD) operations using the Entity Framework features. In addition, the Entity Framework tools make it possible to perform tasks such as adding new tables or creating a new function. The Entity Framework makes developers more efficient because it handles the interaction with the database and lets developers work with familiar objects.
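The object-relational mapping idea isn’t specific to .NET, so here is a minimal, language-neutral sketch of it in Python using only the standard sqlite3 module. This is not the Entity Framework API; the Customer class, the CustomerRepository mapper, and the customers table are all hypothetical names chosen for illustration. The point is simply that the calling code works with objects while the mapper issues the SQL for each CRUD operation.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Customer:
    """Plain object that stands in for one row of the customers table."""
    id: int
    name: str


class CustomerRepository:
    """Hypothetical mapper that translates between Customer objects and SQL rows."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS customers "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def create(self, name):
        cur = self.conn.execute("INSERT INTO customers (name) VALUES (?)", (name,))
        return Customer(cur.lastrowid, name)

    def read(self, customer_id):
        row = self.conn.execute(
            "SELECT id, name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        return Customer(*row) if row else None

    def update(self, customer):
        self.conn.execute(
            "UPDATE customers SET name = ? WHERE id = ?",
            (customer.name, customer.id),
        )

    def delete(self, customer_id):
        self.conn.execute("DELETE FROM customers WHERE id = ?", (customer_id,))


# The calling code never writes SQL; it only touches objects.
conn = sqlite3.connect(":memory:")
repo = CustomerRepository(conn)
alice = repo.create("Alice")
alice.name = "Alice Smith"
repo.update(alice)
```

A real mapper such as the Entity Framework generates this kind of plumbing for you, which is where the efficiency gain comes from.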
Creating a method to interact with the database is fine, but there are different scenarios under which a developer may have to make modifications to the database. For example, the database may only exist as coded classes at the outset and the organization may need those classes transformed into an actual database. (Perhaps the data was stored in XML files previously, but the data grew so large that a true database, such as Microsoft SQL Server, is required now.) The Entity Framework lets you handle these scenarios using three workflows: code first, model first, and database first. The word in front of “first” should give you some idea of how the workflow works, but it really is important to choose the correct workflow or you may find yourself doing a lot of rework later.
Understanding the Model First Workflow
The model first workflow was originally introduced as part of the Entity Framework 4.0 to make it possible for a developer to use a designer to create a database from scratch. The designer lets you visually define the database using an approach much like the technique for creating forms for applications. You select database elements from the Toolbox, perform some configuration, and then run a few commands to create the database. It’s a little more complex than that, but not much. From an ease of design perspective, the model first workflow is definitely the way to go.
When using the model first approach, the designer takes over the task of creating the classes that interact with the database. The designer relies on the .EDMX file it generates to maintain the design specifics. You can affect the output of those classes through configuration changes, or modify the design directly through the .EDMX file or by creating extensions to the design. These are advanced techniques, though, and tend to become quite complicated after a while; it’s often a lot easier to use one of the other workflows to overcome the limitations of this approach.
Most developers use the model first workflow on new projects where there’s no existing database or code base. One benefit of using this approach is that it makes it easier to help others see the design as you put it together. The designer provides a prototyping tool of sorts that can provide understandable output for meetings with people who wouldn’t have the skills required to understand code, but who can understand a block diagram. The overall advantages of this approach are speed of design when working with a new database and the ability to communicate with non-technical groups.
Understanding the Code First Workflow
The code first approach, part of the Entity Framework 4.1, was the last workflow Microsoft introduced. It lets you transform your coded classes into a database application, with no visual model used. Of the three workflows, this approach offers the most control over the final appearance of the application code and the resulting database. However, it’s also the most work. And it presents the biggest obstacles to communicating well with non-developers during the initial design process.
With the code first workflow, you also need to write glue code in the form of mapping and, optionally, database configuration code. However, even in this respect, the workflow provides developers with significant advantages in flexibility. You know precisely what is going on with the underlying code at all times, which is a huge advantage when working with large-scale systems. The cost of this knowledge is equally huge: it takes considerably longer to develop the application (there’s no free lunch).
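To make the “classes first, database second” idea concrete, here is a rough conceptual sketch in Python with the standard sqlite3 module. It is not Entity Framework code; the Product class, the SQL_TYPES table, the MAPPING dictionary (standing in for the glue code), and create_table_sql are all assumptions made up for this example. It shows a hand-written class driving the creation of the table, with a small piece of mapping configuration supplying what the class alone can’t express.

```python
import sqlite3
from dataclasses import dataclass, fields

# Assumed mapping from Python types to SQLite column types.
SQL_TYPES = {int: "INTEGER", str: "TEXT", float: "REAL"}


@dataclass
class Product:
    """Hand-written class; the database schema is derived from it."""
    id: int
    name: str
    price: float


# Hypothetical "glue code": details the class itself does not state,
# such as the table name and which field is the primary key.
MAPPING = {Product: {"table": "products", "key": "id"}}


def create_table_sql(cls):
    """Derive a CREATE TABLE statement from a class and its mapping entry."""
    config = MAPPING[cls]
    columns = []
    for f in fields(cls):
        column = f"{f.name} {SQL_TYPES[f.type]}"
        if f.name == config["key"]:
            column += " PRIMARY KEY"
        columns.append(column)
    return f"CREATE TABLE {config['table']} ({', '.join(columns)})"


# Create the database from the code, not the other way around.
conn = sqlite3.connect(":memory:")
ddl = create_table_sql(Product)
conn.execute(ddl)
```

Every line of the mapping is yours to read and change, which is the flexibility described above. It is also code you have to write and maintain yourself, which is the cost.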
A code first workflow is the only realistic solution when an organization decides on a code-centric approach for an existing database. However, in this case, the developer must reverse engineer the database and create classes that reflect the database design as it exists. Microsoft does provide some assistance to the developer to perform this task in the form of the Entity Framework Power Tools, but you should expect to still end up tweaking the code to precisely match the application requirements and the underlying database.
In many respects, code first is the least useful workflow for large development teams because the lack of automation can produce consistency errors across groups. The resulting model takes more time to tweak. When using this approach, you need a strong centralized management effort to reduce the costs associated with a code-centric approach. However, when trying to integrate disparate databases (such as after an acquisition) this approach does offer significant benefits because it’s more flexible and controllable.
Understanding the Database First Workflow
The Entity Framework was originally designed to make working with existing databases easier. As a result, the database first workflow is the most polished of the three options. Given an existing database, the Entity Framework can analyze it, provide you with options for importing part or all of the structure, and then create the requisite model automatically. The underlying objects are automatically generated as well. All the developer really needs to worry about is creating the application itself; the database access is pretty much handled by the Entity Framework. In general, this feature of the Entity Framework works incredibly well and is quite fast, much faster than any developer could do the job by hand.
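The reverse direction can be sketched the same way. The following Python example (standard library only, and again not the Entity Framework itself) inspects an existing SQLite table and generates a matching class from it. The orders table, the PY_TYPES lookup, and the generate_class helper are hypothetical and exist only to illustrate the idea of deriving objects from a database that is already there.

```python
import sqlite3
from dataclasses import make_dataclass

# Assumed mapping from SQLite declared types back to Python types.
PY_TYPES = {"INTEGER": int, "TEXT": str, "REAL": float}

# Stand-in for a database that already exists.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.execute("INSERT INTO orders (customer, total) VALUES ('Alice', 42.5)")


def generate_class(conn, table):
    """Build a dataclass whose fields mirror the columns of an existing table."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    # The table name is interpolated directly, so it must come from a trusted source.
    columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
    spec = [
        (name, PY_TYPES.get(col_type.upper(), object))
        for _, name, col_type, *_ in columns
    ]
    return make_dataclass(table.capitalize(), spec)


# The class is produced from the database, with no hand-written model.
Orders = generate_class(conn, "orders")
first = Orders(*conn.execute("SELECT * FROM orders").fetchone())
```

Because the class comes out of a generator rather than a developer’s editor, every team member who runs it against the same database gets the same result.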
The most significant advantage of the database first workflow (besides being incredibly fast) is that it’s consistent. Having a means of producing consistent modeling is important in a large team setting. Various team members can work on parts of the database and the resulting model will still go well together because it was produced in a consistent manner in the first place.
Be aware, though, that this is also the least flexible method of creating the underlying objects used by the .NET Framework as part of your application. The development team gains the least knowledge of precisely how things work. In fact, the objects begin as a black box that could cause you problems later if the Entity Framework encounters some oddity in the original database. To make changes to the underlying objects, you also have to rely on working with extensions, rather than modifying the code directly, because the automation overwrites any changes you make otherwise. This leads to problems of figuring out just which file to look in for changes.
Choosing a workflow is a matter of defining precisely how you plan to interact with the database and taking stock of the resources you have available. It’s also essential to consider issues such as the amount of flexibility required to interact with complex database designs.
In many cases, you’ll find that you actually need to combine workflows to obtain the best result. For example, if you have a lot of existing code, yet need to work with existing data as well, you may need to combine the code first and database first approaches. However, when conflicts exist, you may actually need to use the database first approach to import the existing structure and then rely on the model first workflow to overcome the differences between the existing applications and the database.
It’s also important to remember that the source of conflicts most often lies in consolidation. Whether a company has downsized and now needs to combine departments or it made an acquisition and needs to integrate the other organization’s data into the existing database, the source of the problem is the same. Achieving a consolidated, yet effective, data model is essential before you can make any real progress in managing it.