Saturday, January 01, 2005

Avoiding Anemic Domain Models with Hibernate

One of Hibernate's most under-appreciated features is its ability to persist private fields. This feature is useful for avoiding what Martin Fowler calls the Anemic Domain Model anti-pattern, where domain objects (entities) are reduced to "dumb" record structures with no business logic. In an Anemic Domain Model, you lose all the benefits of OOP: polymorphism, data hiding, encapsulation, etc.

The Anemic Domain Model may have originally evolved from EJB CMP, which requires any persistent field to be accessible directly with a public getter/setter. Developers using POJO frameworks like Hibernate often duplicate the same pattern, though, simply replacing the entity beans with POJOs.

This is not just an academic discussion; this has real consequences for the quality of a codebase. (Academically, this is part of the OOP-RDBMS "impedance mismatch"--in particular, that there is no distinction between a setter/constructor call that actually mutates/constructs an object and one that is merely incidental to materializing an existing object's state from persistent storage.) Let's say you're developing a system for issue tracking with a business rule like "anyone can create a ticket or change its status, but only managers can raise it to 'critical.'" A fragment of an Issue object might look like this (some detail omitted to focus on encapsulation/data hiding issues):
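public class Issue {

    public static final String STATUS_CRITICAL = "critical";

    private String m_status;

    public String getStatus() {
        return m_status;
    }

    public void setStatus(String newStatus) {
        // Business rule: only managers may raise an issue to critical.
        // equals() rather than ==, so the check compares string contents.
        // getCurrentUser() is among the omitted details.
        if (STATUS_CRITICAL.equals(newStatus) && !getCurrentUser().isManager()) {
            throw new SecurityException("critical.requires.manager");
        }
        m_status = newStatus;
    }
}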
This looks great until you realize that setStatus(STATUS_CRITICAL) is also going to be called from the persistence layer in materializing an existing Issue that is already critical, not just when making an explicit change through the UI workflow. Since anyone can view any issue, SecurityException will be thrown when a non-manager tries to view an issue that is already critical. We immediately recognize that the persistence layer needs a way to get "privileged" access to set the underlying field directly, bypassing business logic.

The typical workaround is to give up encapsulation and move the business logic into the corresponding service layer object (e.g., stateless session bean) for issue transactions:
public class IssueManager {

    public Issue findIssueById(Long id) {
        // ... load and return the issue
    }

    public Issue newIssue(... fields ...) {
        // begin TX
        // ... setup new issue
        // First copy of the business rule
        if (STATUS_CRITICAL.equals(status) && !getCurrentUser().isManager()) {
            throw new SecurityException("critical.requires.manager");
        }
        issue.setStatus(status);
        // ...
        // commit TX
    }

    public void changeStatus(Long id, String status) {
        // begin TX, load issue
        // Second copy of the same business rule
        if (STATUS_CRITICAL.equals(status) && !getCurrentUser().isManager()) {
            throw new SecurityException("critical.requires.manager");
        }
        issue.setStatus(status);
        // commit TX
    }
}
Now, two real consequences are apparent. First, giving up encapsulation leads to cut-and-paste programming, violating the "don't repeat yourself" principle; every new code path that changes the status must copy the rule again, which increases the risk that it will be left out somewhere it's needed. Second, you lose polymorphism; it is now very difficult to have a subclass of Issue with slightly different business rules. (For example, maybe the main Issue has no restriction on setting status, but a specific type of issue has the critical-requires-manager rule.)
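To make the second point concrete, here is a minimal sketch of what encapsulation would allow if the rule lived in the entity: a base class with no restriction and a subclass that adds the critical-requires-manager rule. The names User, ManagedIssue, changeStatus, and checkStatusChange are assumptions made for illustration, and the acting user is passed as a parameter (instead of getCurrentUser()) only so that the sketch is self-contained.

```java
// Hypothetical stand-in for the application's notion of a user.
class User {
    private final boolean manager;
    User(boolean manager) { this.manager = manager; }
    boolean isManager() { return manager; }
}

// Base issue: any user may set any status.
class Issue {
    static final String STATUS_CRITICAL = "critical";
    private String status = "open";

    String getStatus() { return status; }

    void changeStatus(String newStatus, User actor) {
        checkStatusChange(newStatus, actor);
        status = newStatus;
    }

    // Hook for subclasses; the base rule allows everything.
    protected void checkStatusChange(String newStatus, User actor) { }
}

// Hypothetical subclass that adds the stricter rule.
class ManagedIssue extends Issue {
    @Override
    protected void checkStatusChange(String newStatus, User actor) {
        if (STATUS_CRITICAL.equals(newStatus) && !actor.isManager()) {
            throw new SecurityException("critical.requires.manager");
        }
    }
}

class PolymorphismDemo {
    public static void main(String[] args) {
        User clerk = new User(false);

        Issue plain = new Issue();
        plain.changeStatus(Issue.STATUS_CRITICAL, clerk); // allowed by the base rule

        Issue managed = new ManagedIssue();
        try {
            managed.changeStatus(Issue.STATUS_CRITICAL, clerk);
        } catch (SecurityException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Callers work against Issue and never need to know which rule applies; with the rule in a service-layer method, that method would have to switch on the issue type instead.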

It's true that you could have two separate sets of getters/setters in the Issue itself, one that applies business logic and one that allows direct access and is only used by persistence. This would address the polymorphism issue. But if those direct accessors are also public (as EJB CMP requires), then you still lose data hiding; nothing prevents your service layer/transaction scripts from calling these methods directly.

If you're using Hibernate, though, there is a very elegant solution. Hibernate is effectively "privileged": it uses reflection with the normal access checks turned off, so it can reach private fields and methods directly. Hibernate gives you two options in the above scenario:
  • You can have two separate bean-style properties linked to the same underlying field, one with private getters/setters and the other with public. The private methods access the underlying field directly, and the public ones apply business rules. This is the preferred approach, but has the downside of verbosity, plus you have to use different property names in HQL (private) and everywhere else (public).
  • Hibernate can also persist fields directly by using the "access" attribute on @hibernate.property and so on. The upside is that this is more concise, with only a single public bean-style property, but access="field" requires the mapped property name to exactly match the private instance variable name; this won't work cleanly if you use a Hungarian-style naming convention like "m_foo". You can do something like access="MyFieldAccessor", where MyFieldAccessor is a custom class that implements net.sf.hibernate.property.PropertyAccessor and encodes your naming convention (mapping bean property names to member variable names), but that requires extra effort.
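The following is a minimal sketch of both options using only java.lang.reflect. It is not Hibernate's code; it only stands in for what a persistence layer does when it materializes an object. The names getStatusDirect, setStatusDirect, and PrivilegedAccessDemo are assumptions made for illustration, and the managerLoggedIn flag replaces getCurrentUser().isManager() so that the example is self-contained.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

class Issue {
    static final String STATUS_CRITICAL = "critical";
    // Hypothetical stand-in for getCurrentUser().isManager().
    static boolean managerLoggedIn = false;

    private String m_status;

    // Public property: applies the business rule.
    public String getStatus() { return m_status; }

    public void setStatus(String newStatus) {
        if (STATUS_CRITICAL.equals(newStatus) && !managerLoggedIn) {
            throw new SecurityException("critical.requires.manager");
        }
        m_status = newStatus;
    }

    // Option 1: a second, private property over the same field, with no rules.
    private String getStatusDirect() { return m_status; }
    private void setStatusDirect(String status) { m_status = status; }
}

class PrivilegedAccessDemo {
    // Option 1: invoke the private setter reflectively.
    static void loadViaPrivateSetter(Issue issue, String status) {
        try {
            Method setter = Issue.class.getDeclaredMethod("setStatusDirect", String.class);
            setter.setAccessible(true); // suppress the Java access check
            setter.invoke(issue, status);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Option 2: write the private field itself, bypassing all methods.
    static void loadViaField(Issue issue, String status) {
        try {
            Field field = Issue.class.getDeclaredField("m_status");
            field.setAccessible(true);
            field.set(issue, status);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Issue a = new Issue();
        loadViaPrivateSetter(a, Issue.STATUS_CRITICAL);
        Issue b = new Issue();
        loadViaField(b, Issue.STATUS_CRITICAL);
        System.out.println(a.getStatus() + " " + b.getStatus());

        try {
            new Issue().setStatus(Issue.STATUS_CRITICAL); // non-manager, via the public API
        } catch (SecurityException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Either way, loading an issue that is already critical never runs the manager check, while ordinary callers can only go through the public setStatus.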
There are other uses for this feature in Hibernate:
  • Primary keys are generally supposed to be immutable by normal business logic, set only within the persistence layer. So, "setId" methods can almost always be private or protected.
  • Collection getters and setters can also be kept private, to preserve data hiding (prevent rep exposure). Otherwise, when business logic can manipulate a collection directly, it's difficult to enforce business rules on the collection elements, or even to ensure the elements are of the correct type. (The latter may partially be addressed by generics in Java 5 and/or Hibernate 3.)
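Both uses can be sketched in a few lines. This is an illustration under assumptions, not code from a real mapping: the comments collection, the addComment rule, and the names getCommentsInternal and setCommentsInternal are hypothetical, and the persistence layer is presumed to reach the private members through its privileged access.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Issue {
    private Long id;
    private List<String> comments = new ArrayList<String>();

    public Long getId() { return id; }

    // Private: only the persistence layer assigns the primary key.
    private void setId(Long id) { this.id = id; }

    // Private collection property, intended for the persistence layer only.
    private List<String> getCommentsInternal() { return comments; }
    private void setCommentsInternal(List<String> comments) { this.comments = comments; }

    // Callers get a read-only view, so the internal list is never exposed.
    public List<String> getComments() {
        return Collections.unmodifiableList(comments);
    }

    // All additions go through one method that can enforce business rules.
    public void addComment(String text) {
        if (text == null || text.trim().isEmpty()) {
            throw new IllegalArgumentException("comment.required");
        }
        comments.add(text);
    }
}

class CollectionDemo {
    public static void main(String[] args) {
        Issue issue = new Issue();
        issue.addComment("Reproduced on the test server");
        System.out.println(issue.getComments().size());

        try {
            issue.getComments().add("bypasses addComment");
        } catch (UnsupportedOperationException e) {
            System.out.println("direct modification rejected");
        }
    }
}
```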
I believe JDO also instruments classes at runtime to get similar privileged access to persistent fields.

7 comments:

CmdrDats said...

Yay! Thanks for this info, this issue has been bugging me like crazy for the last couple of days :)

Anonymous said...

It's a good thing that Hibernate added field access instead of requiring getters and setters, but it is still stuck in the general JavaBeans model, which is very limiting to domain models.
In addition, I believe private access to fields or methods will fail when running under a security manager, which most application servers do.
That was actually the reason I wrote O/R Broker, which allows you to map to fields, properties, multi-argument methods AND constructors, public or private. Complete freedom in domain model design, including dynamic support for inheritance. And you control the SQL for full optimizations.

Nils

Anonymous said...

Great article. I've been puzzled by what to do about this for a while. There's one additional wrinkle though that you don't cover (unless I'm missing something) - Hibernate allows you to generate the data objects directly from a mapping file. In those cases, you risk stomping over your business logic or not being able to update objects by keeping the business logic there.

Anonymous said...

Hey, that's great info. I found this feature after struggling for a while.

I have a slight variation on that problem, wherein I would like Hibernate to use the getter to read the current state of the field, which is computed by business logic that depends on other persistent entities. But sometimes Hibernate throws LazyInitializationException (illegal access to loading collection). I guess Hibernate does not like accessing other entities while this entity is being initialized. Do you happen to have any experience with this?

Anonymous said...

About what Akshay Rangnekar commented, .NET 2.0 has a feature called partial classes. This means you can have your logic in a file and let hibernate generate the rest in a separate file.

Anonymous said...

There is an alternative to using the service layer: the value object, which is consistent with separation of concerns. The DAO object just gets/sets all data from the database; the value object hides everything specific to persistence and keeps/adds only business info (for example, the setStatus with the check) scoped to the entity; and finally the service layer ensures business behavior that goes beyond the scope of the entity. With this approach, generating the code of the DAO object makes sense.

Komail Noori said...

Thanks for sharing such useful information with us. I definitely appreciate this.

Regards,
Komail Noori
Web Site Design - SEO Expert