It seems like a dying art, but I still strongly feel that Entity Relationship Diagrams (ERD) should be the starting point of all software development projects. Since they are for me anyway, I wanted a place to refer colleagues to for how to read these diagrams, and an Entity Relationship Diagram Example seemed like a great place to start.
The Example: A Resource Management Application
Consider that we’re writing a resource management application. The first step to creating an ERD is always to identify the nouns (entities). In this case let’s start with:
Here’s the Example Entity Relationship Diagram I’ll decipher piece by piece in this article (click to enlarge):
(note that I’m now using singular names since my somewhat controversial decision to switch to naming entities in the singular)
To read the notations of an Entity Relationship Diagram:
An Entity Relationship Diagram conveys a lot of information with a very concise notation. The important part to keep in mind is to limit what you’re reading using the following technique:
The set of symbols consist of Crow’s feet (which Wikipedia describes as looking like the forward digits of a bird’s claw), O, and dash, but they can be combined in four distinct combinations. Here are the four combinations:
Zero through Many
If, as in the diagram above, the notation closest to the second entity is a crow’s feet with an O next to it, then the first entity can have zero, one, or many of the second entity. Consequently the diagram above would read: “A company can have zero, one, or many employees”.
This is the most common relationship type, and consequently many people ignore the O. While you can consider the O optional, I consider it a best practice to be explicit to differentiate it from the less common one through many relationship.
One through Many
If, as the next diagram shows, the notation closest to the second entity is a crow’s feet with a dash, then the first entity can have one through many of the second entity. More specifically it may not contain zero of the second entity. The example above would thus read (read bottom to top): “A Project can have one through many Employees working on it.”
This is an interesting combination because it can’t (and for various reasons probably shouldn’t if it could) be enforced by a database. Thus, you will only see these in logical, but not a physical, data models. It is still useful to distinguish, but your application will need to enforce the relationship in business rules.
One and Only One (onne)
If the notation closest to the second entity contains two dashes it indicates that the first entity can have one and only one of the second. More specifically it cannot have zero, and it cannot have more than one. The example would thus read: “An Employee can have one and only one Company.”
This combination is the most common after zero through many, and so frequently people consider the second dash optional. In fact, some ignore both dashes, but I would highly recommend at least using one for clarity so as not to confuse the notation with “I’ll fill in the relationship details later”.
Zero or One
A zero or one relationship is indicated by a dash and an O. It indicates that the first entity can have zero or one of the second, but not more than one. The relationship in the example above would thus read: “A Project can have zero or one Technology Project.”
The zero or one relationship is quite common and is frequently abbreviated with just an O (however it is most commonly seen in a many-to-many relationship rather than the one-to-one above, more on this later).
Relationship Types
Having examined the four types of notation, the discussion wouldn’t be complete without a quick overview of the three relationship types. These are:
One-to-Many
A one-to-many (1N) is by far the most common relationship type. It consists of either a one through many or a zero through many notation on one side of a relationship and a one and only one or zero or one notation on the other. The relationship between Company and Employee in the example is a one-to-many relationship.
Many-to-Many
The next most common relationship is a many-to-many (NM). It consists of a zero through many or one through many on both sides of a relationship. This construct only exists in logical data models because databases can’t implement the relationship directly. Physical data models implement a many-to-many relationship by using an associative (or link or resolving) table via two one-to-many relationships.
The relationship between Employee and Project in the example is a many to many relationship. It would exist in logical and physical data models as follows:
One-to-One
Probably the least common and most misunderstood relationship is the one-to-one. It consists of a one and only one notation on one side of a relationship and a zero or one on the other. It warrants a discussion unto itself, but for now the Project to Technology Project relationship in the example is a one to one. Because these relationships are easy to mistake for traditional one-to-many relationships, I have taken to drawing a red dashed line around them. The red dashed line is not standard at all (although a colleague, Steve Dempsey uses a similar notation), but in my experience it can help eliminate confusion.
Conclusion
I hope you’ve found this a useful example for deciphering and verifying entity relationship diagrams. As always please add any comments, disagreements, thoughts or related resources.
---
8/2/2007 Update
BTW, I now reference this article in all of my data models. If you would like to as well here is the tiny url: http://tinyurl.com/yw6f6e
---
10/29/2015 Update
Liked this? Also check out Business Process Modeling for Software Developers
The Example: A Resource Management Application
Consider that we’re writing a resource management application. The first step to creating an ERD is always to identify the nouns (entities). In this case let’s start with:
- Company
- Employee
- Project; and
- Technology Project (which are a specific type of Project that perhaps require special fields like “number of entities”)
Here’s the Example Entity Relationship Diagram I’ll decipher piece by piece in this article (click to enlarge):
(note that I’m now using singular names since my somewhat controversial decision to switch to naming entities in the singular)
To read the notations of an Entity Relationship Diagram:
An Entity Relationship Diagram conveys a lot of information with a very concise notation. The important part to keep in mind is to limit what you’re reading using the following technique:
- Choose two entities (e.g. Company and Employee)
- Pick one that you’re interested in (e.g. how a single Company relates to employees)
- Read the notation on the second entity (e.g. the crow’s feet with the O above it next to the Employee entity).
The set of symbols consist of Crow’s feet (which Wikipedia describes as looking like the forward digits of a bird’s claw), O, and dash, but they can be combined in four distinct combinations. Here are the four combinations:
- Zero through Many (crow's feet, O)
- One through Many (crow's feet, dash)
- One and Only One (dash, dash)
- Zero or One (dash, O)
Zero through Many
If, as in the diagram above, the notation closest to the second entity is a crow’s feet with an O next to it, then the first entity can have zero, one, or many of the second entity. Consequently the diagram above would read: “A company can have zero, one, or many employees”.
This is the most common relationship type, and consequently many people ignore the O. While you can consider the O optional, I consider it a best practice to be explicit to differentiate it from the less common one through many relationship.
One through Many
If, as the next diagram shows, the notation closest to the second entity is a crow’s feet with a dash, then the first entity can have one through many of the second entity. More specifically it may not contain zero of the second entity. The example above would thus read (read bottom to top): “A Project can have one through many Employees working on it.”
This is an interesting combination because it can’t (and for various reasons probably shouldn’t if it could) be enforced by a database. Thus, you will only see these in logical, but not a physical, data models. It is still useful to distinguish, but your application will need to enforce the relationship in business rules.
One and Only One (onne)
If the notation closest to the second entity contains two dashes it indicates that the first entity can have one and only one of the second. More specifically it cannot have zero, and it cannot have more than one. The example would thus read: “An Employee can have one and only one Company.”
This combination is the most common after zero through many, and so frequently people consider the second dash optional. In fact, some ignore both dashes, but I would highly recommend at least using one for clarity so as not to confuse the notation with “I’ll fill in the relationship details later”.
Zero or One
A zero or one relationship is indicated by a dash and an O. It indicates that the first entity can have zero or one of the second, but not more than one. The relationship in the example above would thus read: “A Project can have zero or one Technology Project.”
The zero or one relationship is quite common and is frequently abbreviated with just an O (however it is most commonly seen in a many-to-many relationship rather than the one-to-one above, more on this later).
Relationship Types
Having examined the four types of notation, the discussion wouldn’t be complete without a quick overview of the three relationship types. These are:
- One to Many
- Many to Many
- One to One
One-to-Many
A one-to-many (1N) is by far the most common relationship type. It consists of either a one through many or a zero through many notation on one side of a relationship and a one and only one or zero or one notation on the other. The relationship between Company and Employee in the example is a one-to-many relationship.
Many-to-Many
The next most common relationship is a many-to-many (NM). It consists of a zero through many or one through many on both sides of a relationship. This construct only exists in logical data models because databases can’t implement the relationship directly. Physical data models implement a many-to-many relationship by using an associative (or link or resolving) table via two one-to-many relationships.
The relationship between Employee and Project in the example is a many to many relationship. It would exist in logical and physical data models as follows:
One-to-One
Probably the least common and most misunderstood relationship is the one-to-one. It consists of a one and only one notation on one side of a relationship and a zero or one on the other. It warrants a discussion unto itself, but for now the Project to Technology Project relationship in the example is a one to one. Because these relationships are easy to mistake for traditional one-to-many relationships, I have taken to drawing a red dashed line around them. The red dashed line is not standard at all (although a colleague, Steve Dempsey uses a similar notation), but in my experience it can help eliminate confusion.
Conclusion
I hope you’ve found this a useful example for deciphering and verifying entity relationship diagrams. As always please add any comments, disagreements, thoughts or related resources.
---
8/2/2007 Update
BTW, I now reference this article in all of my data models. If you would like to as well here is the tiny url: http://tinyurl.com/yw6f6e
---
10/29/2015 Update
Liked this? Also check out Business Process Modeling for Software Developers
Comments
Mike Kow
Yes, good point, you can accomplish the same thing with UML class diagrams.
I would argue, however, that an ERD is simpler and better designed for a non-technical audience. I always walk my customers through an ERD prior to moving from analysis to design, and they usually learn how to read it and spot mistakes within minutes.
However, I will research this more and will post on it next.
As pointed out by the above anonymous coward, class hierarchies are usually shown using UML notation, but UML is usually not used for database schemas (unless you begin "overloading" the UML notations to bind their database equivalents (e.g. - and + could denote "table metadata" vs. "real application data", respectively.)
The main difference between the two are that ERs are best for detailing structured data, whereas UML has notations for behaviors (methods) in addition to data, making them more suitable for modeling class hierarchies (who have both state and behavior).
It might also be good to detail the other "edge notations" (such as Chen notation, using 1:N, M:N, etc.) as those are still commonly in use in many places.
Here is the original paper from the ER OG Peter Chen. http://csc.lsu.edu/news/erd.pdf
...and here is Chen's web page: Peter Chen is teh Man (Credit where credit's due *wink*)
Thanks for your post man, I'll have to break my response up.
---
"you cast the ER diagram in the light of a class hierarchy"
I assume you're referring to my use of Project and TechnologyProject to demonstrate a one-to-one relationship. I would argue that this is the correct way to model for a database AND IS THE EXACT SAME THING as an inheritance structure in OO. i.e. 1-1 in your DB = inheritance in your code. This is simply good normalization.
---
"I find the most suitable use for an ER diagram is when doing database schema design"
While I agree that ERD is the starting point for your DB if you have one, I also use it as the starting point for my UML class diagrams, sequence diagrams, etc. And also just to get all stakeholders using the same language. But if you've done your job right, the class diagrams and ERD's should be nearly identical anyway.
---
"Also, aren't relationship-entities drawn as a diamond?"
I don't believe this is true when using the crow's feet notation. Regardless I've never seen anyone use the diamond, but I could see how it would be helpful.
---
"Credit where credit's due"
Thank you. I should have cited Peter Chen to begin with even though he doesn't use the crow's feet notation that I like so much.
Thanks!!
Eric H
Miami of Ohio '94
Xavier '04
Ex:
Set EmployeeID in the PROJECT table to NOT NULL with a foreign key constraint pointing to the EMPLOYEE table.
Am I missing something?
Sure, physically they are. But that's the beauty of ERD's. They abstract away implementation details and just focus on the important parts: Entities and their relationships. And in the process removes numerous arguments like surrogate vs. natural keys.
i am just 16 ans willing to make an online network! it will help me a lot
I'm accountancy student in Singapore(NTU). I'm preparing for my exam-Business IT. I find the examples very useful and easy to comprehend. Thanks =)
Regards
I was reading this, and I was hoping to get an answer to:
What should you put in the Crow's style ERD when you have many-to-many relationships: the logical diagram or the physical diagram of it?
When you have a many-to-many relationship is good to just leave it specified in order to get the "big picture", but this "big picture" does not allow you to specify attributes on the resulting physical table as you would in the original entities of the diagram. So, if you are to draw a many-to-many relationship in an ERD following the Crow's style, how would you do it?
Thanks.
I see that there are no generalizations structure (like a Doctors is a Person), how would you graphically describe one?
Cheers!
The picture under "Many-to-Many" answers your question of "how would you do it?" As for whether to do logical or physical, if I do a facilitated session with an end user or customer I usually use a white board and draw a logical many to many. But when I implement that in Visio at the end of the day I always use physical since 1. Visio doesn't support logical many to many constructs and 2. Since I use Visio to forward engineer my database (i.e. generate DDL).
I would consider the "Project" to "Technology Project" to be what you describe as a "generalizations structure." In any event it is an "is a" type of relationship. Basically I tend to implement class inheritence with one-to-one relationships. I hope that answers your question.
- A L Narayana
What about representing hierarchies? Also, it seems that Visio does not support this feature. How can you workaround this?
Daniel
I assume you mean self-referential relationships? Like an employee-manager relationship? Yea, I guess you're right the article didn't cover it, but it's pretty straight forward. In Visio just add a link from a table to itself. Usually Visio doesn't know what foreign key to use, so you have to add a new column like ManagerId and then click on the relationship and manually associate it with EmployeeId. Should be pretty straight forward.
Very clearly and simple.
republic polytechnic sucks!
Thanks a lot
Phani Kumar Palika
I need to make a job search application can you guide me for the databse design.
BTW, I have the same doubt as leonard, does logical-physical conversion match each other?
You're absolutely right that the many-to-many resolution lost information in it's implementation, namely that a project must have one employee. That is actually a point I mentioned earlier in the section "One Through Many". Specifically:
"This is an interesting combination because it can’t ... be enforced by a database."
So essentially this is a business rule that will need to be enforced by your application tier / UI.
thankx
Shabaz_ch@hotmail.com
I am happy to see that you place a high value on the conceptual schema since as you've pointed out it is the glue which can drive your persistence, object oriented framework and business rules (inference) layers.
For a full exposition see Conceptual Modeling of Information Systems, Antoni Olive or Information Modeling and Relational Databases, Terry Halpin. Two excellent texts which address these issues.
it is very usefull for beginers.
Sadly, I couldn:t find your writing while I was in my database class...
It seems to cover much of the more formal information processing lacking in the details of database writing.
As you say; a diagram covers a thousand words, but I was wondering how yourself might start to draw a relationship seeing as you've mentioned a whiteboard.
Would you start with a line and then draw crows feet on one side of an entity to signify a many to one relationship, say?
After this, adding mandatory conditions by empty or solid circles or by ticks?
On paper, how would one clearly cross out a subtle tick or circle if the customer disagrees with the interpretation for their buisness.
Sincerly Yours,
A more better one would the notation devised by Michael Senko.
Thanks
A clear explanation that covers the essence of ER modelling (other authors take 30+ pages to tell the same).
I may come back to this later.
best regards
GN
I am a newbie into the world of databases. Could you help me given the scenario described below:
Consider the following set of requirements for a university database that is used to keep track of students' transcripts.
a. The university keeps track of each student's name, student number, social security number, current address and phone, permanent address and phone, birthdates, sex, class (freshman, sophomore, ... , graduate), major department, minor department (if any), and degree program (B.A., B.S., ... , Ph.D.). Some user applications need to refer to the city, state, and zip code of the student's permanent address and to the student's last name. Both social security number and student number have unique values for each student.
b. Each department is described by a name, department code, office number, office phone, and college. Both name and code have unique values for each department.
c. Each course has a course name, description, course number, number of semester hours, level, and offering department. The value of the course number is unique for each course.
d. Each section has an instructor, semester, year, course, and section number. The section number distinguishes sections of the same course that are taught during the same semester/year; its values are 1,2,3, ... , up to the number of sections
taught during each semester.
e. A grade report has a student, section, letter grade, and numeric grade (0, 1, 2, 3, or 4).
Question:
Design an ER schema for this application, and draw an ER diagram for that schema. Specify key attributes of each entity type, and structural constraints on each relationship type.
Note any unspecified requirements, and make appropriate assumptions to make the specification complete.
First of all I would like to thank Lee. He has pointed a very useful and important topic about ER diagram, I found what exactly I was looking for and searching. I am doing my bachelor in CS and I think I will need more help .
the clearer the better...
keep up!...
1) his usage of ‘can’ to mean either ‘may’ (see example 1 ‘zero thru many’ and example 4 ‘zero or one’) or ‘must’ (see example 2 ‘one thru many’ and example 3 ‘one only’) is misleading and bewildering, especially for those relatively new to data modelling
2) his usage of the following symbols is both redundant and contrary to accepted data modelling practice:
- the double dash to denote the mandatory existence of one entity occurrence involved in a relationship with one or more other entity occurrences; instead, the use of a single dash without the crows feet symbol (used to indicate multiple occurrences) next to it is the norm
- the use of a single dash to denote a single entity occurrence whether optional or mandatory; data modellers typically just leave out the crows feet symbol to do this and only use (see comment above) the single dash to indicate that the occurrence is mandatory, not if it’s optional
- the use of ‘O’ to denote only a zero entity occurrence; the normal practice is to use ‘O’ only to indicate whether the occurrence or occurrences of that entity are optionally (i.e. zero, one or more occurrences) related to another entity
3) his failure to mention that the presence of an associative or link table in many-to-many type relationships in a physical model is also common practice in the design of a logical model (where tables are called entities and where such entities may contain unique attributes)
4) his use of a broken rather than an unbroken line linking related entities in the examples could do with explanation – why?
-an intuitive student from Philippines
Abhishek N.
thanks for that.