Database Design – How to Model a Letter Transportation Business Context?

database-designdatabase-diagramserd

I am trying to crate entity-relationship diagram for a transport company database. This database should store information about letters, deliveries, moving letters from one warehouse to another, and historical information. It should allow tracking letters, e.g., checking what letters were on certain date in a specific warehouse.

The process I want to model looks like this:

  • letter is collected from sender and it goes to warehouse,
  • then it could be transferred between warehouses,
  • at last, letter is delivered to recipient.

Business rules

  • Letters are sent and received by People, one letter can be sent by one particular Person, and also received by a specific Person.
  • There can be many People in database, if a Person is stored in the database it must be Sender or Receiver of some Letter.
  • Transports are made between different Warehouses and must consist of many Letters, Letters can be a part of multiple Transports (on different Dates)
  • Warehouses store multiple Letters
  • A Letter can be stored in few different Warehouses on different Dates (on ‘01.11’ it is in ‘Warehouse A’, and on ‘02.11’ in ‘Warehouse B’)
  • DateStorage – contains information about when a Letter has arrived to certain Warehouse
  • Transports are being done on certain Date, they depart from one Warehouse and go to another.
  • Letters can be delivered to recipients. If Delivery Attempt fails – there could be more attempts.

Current diagram

This is the database diagram I have created so far:

![Diagram

Relationship between Letters and Warehouses means that when letter is picked up by courier it is put to that warehouse.

Questions and current considerations

Is this diagram correct, what could I improve?

Is it OK that there are to relationship between transports and warehouses (from could be deduced from previous transports and initial warehouse of letter)?

Best Answer

Although you stated that the structure depicted in your diagram is part of a university project, the objective should be to make it as realistic as possible, so I consider that carrying out an interview (or a series of interviews) with the experts of a correspondence company might be very helpful (perhaps indispensable).

In this way, since creating a full and accurate entity-relationship diagram (or any of its derivatives) for this kind of business domain would require a much more in-depth examination (and, thus, a very long series of to-ings and fro-ings), the goal of this answer is to point you in the right direction with respect to the technical methods you might have to employ, presenting an expository analysis that includes my first draft of an IDEF1X model made up of elements that seem feasible, so that you can use it as a reference to capture the meaning of a real scenario.

Central aspects

The way I see it, the aspects of the business domain analysis that require special attention are the Events in which a Letter might be involved; therefore, I will detail bellow a hypothetical context that consists of some supossed Types of Event that appear of relevance (but, naturally, you should confirm, modify or discard all the discussed points on your own).

Considerations

I have listed the elements (brought up in the talk we had via chat) that I consider particularly helpful to work on forwarding the situation:

  • [A]t any time, I would like to know where the letter is - no matter if it is in some warehouse or it is being transported.
  • In my design sender is only the original sender of a letter (client of the company) and receiver (recipient) is a person to which letters is addressed - there is always one sender and one recipient for a letter.
  • [F]or one transport there can only be one courier.
  • One letter can be stored in up to about 10 warehouse. However number of all warehouses is much higher - few hundreds (or maybe even thousands).
  • My idea was to store every client (sender / recipient) in database. So if some person have send or received at least one letter it […] is in database (but if person sends / receives more letters it is in database only once).

Business rules

Then, based on (a) such elements, (b) the content of your question, and (c) some estimations and assumptions that appear viable, I wrote the following formulations that describe the tentative interrelationships between the hypothetical entity types of significance (i.e., a substantial part of the business context rules, which serve the purpose of defining the corresponding conceptual model):

Person, Letter and Address

  • A Letter is sent by exactly one Person (who plays the role of Sender)
  • A Person sends zero-one-or-many Letters
  • A Letter addresses exactly one Person (who carries out the role of Addressee)
  • A Person (performing the role of Addressee) is addressed in zero-one-or-many Letters
  • A Person keeps zero-one-or-many Addresses
  • An Address is kept by zero-one-or-many Persons
  • An Address locates the Sender of zero-one-or-many Letters
  • An Address locates the Addressee of zero-one-or-many Letters

Letter and Event

  • A Letter is involved in one-to-many Events
  • An Event is classified by exactly one EventType
  • An Event is
    • either a Dispatch1
    • or an Exit2
    • or a Transport3
    • or a HalfwayStorage4
    • or a DeliveryAttempt5
    • or a Receiving6

1 Dispatch. The Event that occurs when a certain Person leaves a specific Letter in a precise Warehouse so that it can take part in a transportation process.

2 Exit. The Event that happens when a particular Letter departs from a given Warehouse to start the transportation.

3 Transport. The Event that occurs when a certain Letter is being moved between distinct Warehouses while the transportation is taking place.

4 Halfway Storage. The Event that happens when a concrete Letter is kept in an exact Warehouse in the middle of the transportation.

5 Delivery Attempt. The Event that occurs when a certain Person (playing the role of Courier) tries to deliver a specific Letter to the Person who was addressed in it.

6 Receiving. The Event that happens when a certain Letter is received in the Address that locates the Person who was addressed in such Letter.

Note: Some of these Types of Event (or Event Types) might come about several times with respect to a certain Letter (e.g., a Letter could be involved in multiple Transport occurrences during a whole transportation process); other Types of Event appear to be constrained to a unique occurrence regarding a concrete transportation process (e.g., Dispatch and Receiving). On the other hand, a real business domain may entail more Event Types, and the ones discussed in this answer might simply not apply or could have distinct characteristics, that is why performing an analysis yourself is paramount. Furthermore, the terms I use here are merely explanatory, as they possibly are not equal to those employed in a real organization.

Warehouse and the different subtypes of Event: Dispatch, Exit and HalfwayStorage

  • A Warehouse lodges zero-one-or-many Dispatches
  • A Warehouse is the origin of zero-one-or-many Exits
  • A Warehouse is the destination of zero-one-or-many Exits
  • A Warehouse provides zero-one-or-many HalfwayStorages

Vehicule and the distinct subtypes of Event: Exit, Transport and DeliveryAttempt

  • A Vehicule is utilized in zero-one-or-many Exits
  • A Vehicule is employed in zero-one-or-many Transports
  • A Vehicule is used in zero-one-or-many DeliveryAttempts

Person and the subtypes of Event: Transport, DeliveryAttempt and Receiving

  • A Person (who plays the role of Courier) carries zero-one-or-many Transports
  • A Person (who plays the role of Courier) conveys zero-one-or-many DeliveryAttempts
  • A Person (who plays the role of Courier) participates in zero-one-or-many Receivings

Illustrative IDEF1X model

Consequently, I built an IDEF1Xa model in terms of all the points explained above. It is shown in Figure 1 (make sure to click the link so that you can observe it at a higher resolution):

Figure 1 - Letter Management IDEF1X Model - First Draft

a Integration Definition for Information Modeling (IDEF1X) is a highly recommendable data modeling technique that was established as a standard in december 1993 by the United States National Institute of Standards and Technology (NIST). It is solidly based on (a) some of the theoretical works authored by the originator of the Relational Model, i.e., Dr. E. F. Codd; on (b) the Entity-Relationship view, developed by Dr. P. P. Chen; and also on (c) the Logical Database Design Technique, created by Robert G. Brown.

Event and its subtypes

As demonstrated in said model, there is a supertype-subtype relationship that comes about

  • between Event (the supertype), and
  • Dispatch, Exit, Transport, HalfwayStorage, DeliveryAttempt and Receiving (the subtypes).

The Event superentity type possesses the properties that are common to all its subtypes, i.e., LetterNumber and DateTime (exposed as a composite PRIMARY KEY [PK for brevity]), along with a discriminator, i.e., EventTypeCode, which indicates the exact kind of subtype instance that must be complementing each Event occurrence.

The PKs of Dispatch, Exit, Transport, HalfwayStorage, DeliveryAttempt and Receiving are, at the same time, FOREIGN KEYs (FKs) that point to the Event PK.

Every one of these subtypes displays the properties (or attributes) that apply exclusively to each of them, and some of those properties are FKs that illustrate the relationships that such subtypes have with some of the other entity types included in the model.

In this regard, you might find of help my answers to

Data samples

You did not clarify if the diagram is going to be used as a conceptual platform to construct a logical database structure (with proper tables, columns, column types and constraints) on a particular SQL database management system, but some data samples in tabular form will help to supply a more rounded answer that introduces further possibilities.

An EventType table

A table standing for the EventType entity type would fulfill a “look-up” role, and it may comprise the following rows:

 +-—————————————-+-————————————————-+-———————————————-+
 | EventTypeCode | Name             | Description     |
 +-—————————————-+-————————————————-+-———————————————-+
 | D             | Dispatch         | The event that… |
 +---------------+------------------+-----------------+
 | E             | Exit             | The event that… |
 +---------------+------------------+-----------------+
 | T             | Transport        | The event that… |
 +---------------+------------------+-----------------+
 | H             | Halfway Storage  | The event that… |
 +---------------+------------------+-----------------+
 | D             | Delivery Attempt | The event that… |
 +---------------+------------------+-----------------+
 | R             | Receiving        | The event that… |
 +---------------+------------------+-----------------+

Note that the EventTypeCode column, that has to be constrained as the PK, contains values that are meaningful (and, hence, are highly legible from the end user point of view), which enhances their usage when referenced from columns with FK constraints. At the same time, since the aforementioned values are physically light (as for their size in bytes), they behave quite fast regarding data retrieval processes and optimizes disk space consumption (mainly when pointed to from FKs).

An Event table

Then, let us suposse that a table dennoting the entity type named Event maintains the data points that follow for the Letter that is primarily identified by the Number 1750:

 +-————————————-+-———————————————————————-+-—————————————-+
 | LetterNumber | RegisteredDateTime      | EventTypeCode |
 +-————————————-+-———————————————————————-+-—————————————-+
 | 1750         | 2016-11-12 16:58:12.000 | D             |
 +--------------+-------------------------+---------------+
 | 1750         | 2016-11-15 09:12:05.000 | E             |
 +--------------+-------------------------+---------------+
 | 1750         | 2016-11-15 10:12:05.000 | T             |
 +--------------+-------------------------+---------------+
 | 1750         | 2016-11-15 11:12:05.000 | T             |
 +--------------+-------------------------+---------------+
 | 1750         | 2016-11-15 12:12:05.000 | T             |
 +--------------+-------------------------+---------------+
 | 1750         | 2016-11-15 12:46:01.000 | H             |
 +--------------+-------------------------+---------------+
 | 1750         | 2016-11-17 06:24:07.000 | T             |
 +--------------+-------------------------+---------------+
 | 1750         | 2016-11-17 07:24:07.000 | T             |
 +--------------+-------------------------+---------------+
 | 1750         | 2016-11-17 08:24:07.000 | T             |
 +--------------+-------------------------+---------------+
 | 1750         | 2016-11-17 09:10:13.000 | H             |
 +--------------+-------------------------+---------------+
 | 1750         | 2016-11-18 08:01:01.000 | T             |
 +--------------+-------------------------+---------------+
 | 1750         | 2016-11-18 09:01:01.000 | T             |
 +--------------+-------------------------+---------------+
 | 1750         | 2016-11-18 09:27:12.000 | H             |
 +--------------+-------------------------+---------------+
 | 1750         | 2016-11-19 11:16:08.000 | D             |
 +--------------+-------------------------+---------------+
 | 1750         | 2016-11-19 11:48:03.000 | R             |
 +--------------+-------------------------+---------------+

As shown, that table would be retaining a sequence of all the Events in which Letter Number 1750 was involved during a transportation. Of course, it would keep all the Events connected with all the pertinent Letters, but I only depicted a few hypothetical rows related to a supossed Letter Number 1750 in order to simplify the example. The sort of “chronological” data structure held in this table is commonly referred to as a time series.

The Event.EventTypeCode column, which should be constrained as a FK, holds values that indicate the kind of Event that was registered (in a user-friendly manner, in accordance with the points mentioned in the previous section).

Application programs sharing data from tables representing the subtypes

A table with partial Event information (like the one previously deliberated) can be displayed in, e.g., a page of a web application program that provides a link that redirects to subsequent pages based on (a) the Event Type, (b) the Letter Number and (c) the Date and Time of the Event of interest.

In turn, the subsequent pages will exhibit the full Event information. E.g., if

  • the EventTypeCode is D,
  • the LetterNumber is 1750, and
  • the DateTime is 2016-11-12 16:58:12.000,

then the redirection must take to a page that presents the complete Dispatch row, e.g.:

 +-————————————-+-———————————————————————-+-———————————————-+
 | LetterNumber | RegisteredDateTime      | WarehouseNumber |
 +-————————————-+-———————————————————————-+-———————————————-+
 | 1750         | 2016-11-12 16:58:12.000 |               x |
 +--------------+-------------------------+-----------------+

In case that

  • the EventTypeCode is T,
  • the LetterNumber is 1750, and
  • the DateTime is 2016-11-15 10:12:05.000,

the redirection ought to take to a page that sets out the pieces of information of the whole Transport row:

 +-————————————-+-———————————————————————-+-——————————————-+-—————————-+-————————-+-—————————-+
 | LetterNumber | RegisteredDateTime      | VehiculeNumber | CourierId | Latitude | Longitude |
 +-————————————-+-———————————————————————-+-——————————————-+-—————————-+-————————-+-—————————-+
 | 1750         | 2016-11-12 16:58:12.000 |              w |         x |        y |         z |
 +--------------+-------------------------+----------------+-----------+----------+-----------+

Integrity, consistency and derivability

Evidently, a database structure with these characteristics entails a considerable set of constraints that must be put up to protect the integrity of the data, guaranteeing its consistency with the rules determined at the conceptual level (e.g., the cardinalities of the relationships between entity types/entity type instances).

Some of these constraints can be configured declaratively by way of PK and FK definitions, but others require resorting to a procedural approach —due to limitations in the declarative facilities provided by the major SQL database management systems—.

Among the rules that have to be enforced procedurally is the feature that (1) each Event row must be suplemented at all times by (2) the respective counterpart in exactly one of the tables representing the subtypes, which (3) must “comply with” the value enclosed in the Event.EventTypeCode column —standing for the discriminator—, so the utilization of ACID TRANSACTIONS is convenient to ensure that these circumstances are always met. Other possibility would be employing TRIGGERS, but they tend to make things messy.

And, yes, it would be practical to express and fix as VIEWs several queries that “combine” data from some or all the related tables in order to, e.g., simplify the pertinent data derivation.