All the Shapes of Your Data


One of the first tasks in designing an app is figuring out the shape of your data. It’s also one of the more challenging parts to get right.

Most college Computer Science curricula include an entire class on Data Structures. In this blog post, I won’t tackle all the complexities. I’ll focus on a specific notion that comes up a lot in mobile app development – that your data will take several different forms as it moves between the different parts of your app.

Here and There

Here’s a common mobile app scenario: the app communicates using JSON (over REST or GraphQL) with a backend server to synchronize data, which it then caches to a local database – on iOS this is usually Core Data. (Realm and Firebase are also local database alternatives which may or may not be used with their own backends.) The presentation layer of the app reads the data from the local database, displays it to the user, and lets the user modify it. Changes are written back to the server as JSON and to the local database as well. (The order of those operations can vary depending on whether you allow offline modifications, but the concepts are the same.)

The examples here are in Swift for iOS, but the principles are universal.

My Home Town – Shape 1: The Domain Model

Let’s use the classic To-Do List example app, since it’s one that many developers are familiar with. Here’s how a to-do item looks as a Swift struct:

struct ToDoItem {
    let id: UUID
    let title: String
    let categoryID: UUID
    let createdAt: Date
    let completedAt: Date?
}
struct Category {
    let id: UUID
    let title: String
}

This should be fairly easy to understand. It contains:

  1. An ID which we will use to refer to the item within the database and the REST API.
  2. The title. For example, “clean the house”.
  3. The ID of a category to which the item belongs. The categories are described by a separate data structure.
  4. The date the item was created.
  5. The date the item was completed, which can be nil if it hasn’t been done yet.

One thing to notice is that we refer to the item’s category using its ID, and not an object reference. This is for performance and robustness, and I’ll address the reasons for it below.

This struct represents the form of data that you use for business logic and probably much of the user interface. The name for this type of object is Domain Model (DM). The name, like much of the terminology in Software Architecture, is not universally agreed upon. For this discussion, it will do.

On the Road Again – Shape 2: The Data Transfer Object

Next we consider how your data looks when it’s going over the network. This Data Transfer Object (DTO) represents the JSON encoding of the object. Since we’re using Swift, for this simple case we can just add Codable conformance to the struct and it will be automatically convertible to JSON.

struct ToDoItem : Codable {
    let id: UUID
    let title: String
    let completedAt: Date?
    let createdAt: Date
    let categoryID: UUID
}

In this case our Domain Model is performing double-duty as the Data Transfer Object.
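To make the double duty concrete, here is a minimal sketch of the struct going to JSON and back with Foundation’s JSONEncoder and JSONDecoder. The ISO 8601 date format, the helper names encodeToJSON and decodeFromJSON, and the sample values are my assumptions for illustration; your backend may use a different date format.

```swift
import Foundation

struct ToDoItem: Codable, Equatable {
    let id: UUID
    let title: String
    let completedAt: Date?
    let createdAt: Date
    let categoryID: UUID
}

// Assumption: dates travel over the wire as ISO 8601 strings.
func encodeToJSON(_ item: ToDoItem) throws -> Data {
    let encoder = JSONEncoder()
    encoder.dateEncodingStrategy = .iso8601
    return try encoder.encode(item)
}

func decodeFromJSON(_ data: Data) throws -> ToDoItem {
    let decoder = JSONDecoder()
    decoder.dateDecodingStrategy = .iso8601
    return try decoder.decode(ToDoItem.self, from: data)
}

// Hypothetical sample item with a whole-second timestamp.
let sampleItem = ToDoItem(id: UUID(),
                          title: "Clean the house",
                          completedAt: nil,
                          createdAt: Date(timeIntervalSince1970: 1_700_000_000),
                          categoryID: UUID())

do {
    let data = try encodeToJSON(sampleItem)
    let copy = try decodeFromJSON(data)
    // Compares the decoded copy with the original item.
    print(copy == sampleItem)
} catch {
    print("Round trip failed: \(error)")
}
```

One caveat with this sketch: the .iso8601 strategy does not carry fractional seconds, so a timestamp taken from Date() may not compare equal after a round trip.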

In other languages or in more complicated scenarios, there may be fields that need converting between the JSON format and the domain format and so you’ll need to create a separate structure for JSON conversion.
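As a sketch of what such a separate structure might look like, suppose the server uses snake_case keys and sends dates as Unix timestamps. Both of those conventions, along with the ToDoItemWireDTO name, the decodeItem helper, and the sample JSON, are assumptions made up for this example.

```swift
import Foundation

struct ToDoItem: Equatable {
    let id: UUID
    let title: String
    let completedAt: Date?
    let createdAt: Date
    let categoryID: UUID
}

// Hypothetical wire format: snake_case keys, dates as seconds since 1970.
struct ToDoItemWireDTO: Codable {
    let id: UUID
    let title: String
    let completedAt: Double?
    let createdAt: Double
    let categoryID: UUID

    enum CodingKeys: String, CodingKey {
        case id
        case title
        case completedAt = "completed_at"
        case createdAt = "created_at"
        case categoryID = "category_id"
    }

    // Converts the wire representation into the domain representation.
    func toDomainObject() -> ToDoItem {
        return ToDoItem(id: id,
                        title: title,
                        completedAt: completedAt.map { Date(timeIntervalSince1970: $0) },
                        createdAt: Date(timeIntervalSince1970: createdAt),
                        categoryID: categoryID)
    }
}

// Made-up payload for illustration.
let sampleJSON = """
{
  "id": "E621E1F8-C36C-495A-93FC-0C247A3E6E5F",
  "title": "Clean the house",
  "created_at": 1700000000,
  "category_id": "123E4567-E89B-12D3-A456-426614174000"
}
"""

func decodeItem(from json: String) throws -> ToDoItem {
    let dto = try JSONDecoder().decode(ToDoItemWireDTO.self, from: Data(json.utf8))
    return dto.toDomainObject()
}
```

The domain struct no longer needs to be Codable at all here, which keeps the knowledge of the wire format in one place.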

You may also have more than one DTO for a DM. This is common when using GraphQL, which allows you to request a subset of fields. For example, imagine you want to request just those to-do items that belong to a certain category and that are not completed. In that case, there’s no need to include completedAt and categoryID in the received JSON, so that DTO may look like:

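struct ToDoItemSimpleDTO : Codable {
    let id: UUID
    let title: String
    let createdAt: Date

    // Builds the full domain model. The caller supplies the category,
    // and completedAt is nil because the query returned only open items.
    func toDomainObject(categoryID: UUID) -> ToDoItem {
        return ToDoItem(id: id,
                        title: title,
                        completedAt: nil,
                        createdAt: createdAt,
                        categoryID: categoryID)
    }
}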

Notice that it includes a function to convert the simple DTO to the full domain model. The function passes nil as the completedAt parameter because we know that’s the case from the query we performed. It takes the categoryID as a parameter because the struct doesn’t know about the category, but presumably the calling code does.

You may choose to have separate structs for your DTOs even in the simple case, as part of your architecture design – it’s a tradeoff between the two engineering goals, the Single Responsibility Principle and Don’t Repeat Yourself. Most architectural decisions are tradeoffs and there’s no one-size-fits-all approach. Just be sure to validate any DTOs from untrusted locations before using them as Domain Models.
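One way to do that validation is to make the conversion itself throw, so that an invalid DTO can never become a Domain Model. This is only a sketch: the two rules checked here (a non-blank title, and a completion date that is not earlier than the creation date) and the ValidationError type are hypothetical, and your app’s real rules will differ.

```swift
import Foundation

struct ToDoItem: Equatable {
    let id: UUID
    let title: String
    let completedAt: Date?
    let createdAt: Date
    let categoryID: UUID
}

struct ToDoItemDTO: Codable {
    let id: UUID
    let title: String
    let completedAt: Date?
    let createdAt: Date
    let categoryID: UUID
}

// Hypothetical validation failures for this example.
enum ValidationError: Error, Equatable {
    case emptyTitle
    case completedBeforeCreated
}

extension ToDoItemDTO {
    // Returns a Domain Model only when the DTO passes every check.
    func validated() throws -> ToDoItem {
        let trimmedTitle = title.trimmingCharacters(in: .whitespacesAndNewlines)
        guard !trimmedTitle.isEmpty else {
            throw ValidationError.emptyTitle
        }
        if let completedAt = completedAt, completedAt < createdAt {
            throw ValidationError.completedBeforeCreated
        }
        return ToDoItem(id: id,
                        title: trimmedTitle,
                        completedAt: completedAt,
                        createdAt: createdAt,
                        categoryID: categoryID)
    }
}
```

Because the only path from DTO to Domain Model goes through validated(), the rest of the app can assume that every ToDoItem it sees has already been checked.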

Memories – Shape 3: The Database Entity

Once you have the data from the server, you often want to cache it locally in a database. Now your data takes on yet a third shape: the Database Entity. (The terminology for this data shape is not as well established, so I’m borrowing the “Entity” term from Core Data. You may also see it called an Active Record.)

If you’re using Core Data, you define this object in a .xcdatamodeld file created in Xcode’s graphical editor. This automatically generates a class derived from NSManagedObject which you use to perform your database queries. (Realm has something similar where you derive a class from Object.)

In most cases you should not use those objects directly in your application’s business or UI logic, but should provide methods to perform mapping between your Database Entities and your Domain Models. This helps avoid situations where data changes don’t get reflected in the UI, or worse, when they only partially show up. Writing presentation code where the data it uses can change at any time is a recipe for bugs.
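Here is a rough sketch of that mapping. To keep it self-contained, ToDoItemEntity is a plain mutable class standing in for the generated NSManagedObject subclass; in a real project the same two methods would live in an extension of the generated class. The method names toDomainObject and update(from:) are my own choices.

```swift
import Foundation

// Immutable Domain Model used by business and UI logic.
struct ToDoItem: Equatable {
    let id: UUID
    let title: String
    let completedAt: Date?
    let createdAt: Date
    let categoryID: UUID
}

// Stand-in for the Core Data generated class (an assumption for this
// sketch). Like a managed object, it is mutable and has reference semantics.
final class ToDoItemEntity {
    let id: UUID
    var title: String
    var completedAt: Date?
    let createdAt: Date
    var categoryID: UUID

    init(id: UUID, title: String, completedAt: Date?, createdAt: Date, categoryID: UUID) {
        self.id = id
        self.title = title
        self.completedAt = completedAt
        self.createdAt = createdAt
        self.categoryID = categoryID
    }
}

extension ToDoItemEntity {
    // Takes an immutable snapshot of the entity's current state.
    func toDomainObject() -> ToDoItem {
        return ToDoItem(id: id,
                        title: title,
                        completedAt: completedAt,
                        createdAt: createdAt,
                        categoryID: categoryID)
    }

    // Copies the editable fields of a Domain Model back into the entity.
    func update(from item: ToDoItem) {
        title = item.title
        completedAt = item.completedAt
        categoryID = item.categoryID
    }
}
```

A snapshot returned by toDomainObject() stays the same even if the entity is changed afterwards, which is the property the presentation code relies on.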

A note about object relations

Core Data, like most other database access layers, provides a way to set up object relations and to do lazy fetching of the related objects. I have found that using this functionality can lead to more problems than it solves, so I recommend pre-fetching any related objects that you need into an immutable array and using them that way. For an in-depth exploration of this issue, see this explanation for why Android’s Room framework chose not to include object relations in this manner.

Other Shapes

Depending on your app, your data may take other shapes as well. For example, you may aggregate multiple models into one. If you wanted to query a to-do category along with all its items, you might end up with a struct like:

struct CategoryAndItems {
    let id: UUID
    let title: String
    let items: [ToDoItem]
}

This is a special kind of Domain Model.
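As a sketch of how such an aggregate could be assembled, the function below joins categories and items that have already been fetched into plain arrays, matching them by ID. The aggregate function name is hypothetical, and a real app might do this join in the database query instead.

```swift
import Foundation

struct ToDoItem: Equatable {
    let id: UUID
    let title: String
    let completedAt: Date?
    let createdAt: Date
    let categoryID: UUID
}

struct Category {
    let id: UUID
    let title: String
}

struct CategoryAndItems {
    let id: UUID
    let title: String
    let items: [ToDoItem]
}

// Joins pre-fetched categories and items by category ID.
// Categories without items get an empty array.
func aggregate(categories: [Category], items: [ToDoItem]) -> [CategoryAndItems] {
    let itemsByCategory = Dictionary(grouping: items, by: { $0.categoryID })
    return categories.map { category in
        CategoryAndItems(id: category.id,
                         title: category.title,
                         items: itemsByCategory[category.id] ?? [])
    }
}
```

Because the items refer to their category by ID rather than by object reference, this join works on plain immutable values and needs no live database connection.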

Or in your user interface or testing code you may need simplified versions of any of the above shapes. If you’re using an MVVM architecture, your View Models may have their own way of representing data, or they may reference the DM directly.
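For instance, a View Model for a single row in the list might reduce the Domain Model to just what the row displays. This is a hypothetical sketch; the ToDoItemViewModel name and its fields are assumptions, not part of any framework.

```swift
import Foundation

struct ToDoItem {
    let id: UUID
    let title: String
    let completedAt: Date?
    let createdAt: Date
    let categoryID: UUID
}

// Hypothetical View Model: only the values a list row needs to render.
struct ToDoItemViewModel {
    let title: String
    let isCompleted: Bool
    let accessibilityLabel: String

    init(item: ToDoItem) {
        let completed = item.completedAt != nil
        title = item.title
        isCompleted = completed
        accessibilityLabel = completed
            ? "\(item.title), completed"
            : "\(item.title), not completed"
    }
}
```

The view never needs to know that completion is stored as an optional date; it sees only a Boolean.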

And of course if you are creating your own backend server, it will have its own way of representing the data.

Changes – Converting Between the Shapes

You will of course need to convert between the various shapes. For example, when you receive a payload of items from the server, you need a function to convert the DTOs into Database Entities.

It’s up to you where to put this function – it could be a member function in an extension of the DTO, a constructor on the Database Entity, or a standalone function. The important thing is to keep these “glue” functions separate from the actual business logic. You want to be able to test the DTO without having to depend on the DM or Entity, and likewise you want to be able to test your domain code without reference to the database or the network.
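Here is one possible arrangement, with the conversion as a member function in an extension of the DTO. As before, ToDoItemEntity is a plain class standing in for a Core Data managed object so that the sketch is self-contained, and the payload format (a JSON array with dates as Unix timestamps) and the function names are assumptions.

```swift
import Foundation

struct ToDoItemDTO: Codable {
    let id: UUID
    let title: String
    let completedAt: Date?
    let createdAt: Date
    let categoryID: UUID
}

// Stand-in for the database entity class (an assumption for this sketch).
final class ToDoItemEntity {
    let id: UUID
    var title: String
    var completedAt: Date?
    let createdAt: Date
    var categoryID: UUID

    init(id: UUID, title: String, completedAt: Date?, createdAt: Date, categoryID: UUID) {
        self.id = id
        self.title = title
        self.completedAt = completedAt
        self.createdAt = createdAt
        self.categoryID = categoryID
    }
}

// Glue code, kept apart from both the DTO definition and business logic.
extension ToDoItemDTO {
    func makeEntity() -> ToDoItemEntity {
        return ToDoItemEntity(id: id,
                              title: title,
                              completedAt: completedAt,
                              createdAt: createdAt,
                              categoryID: categoryID)
    }
}

// Decodes a server payload and converts every DTO into an entity.
// Assumption: the payload is a JSON array with dates as Unix timestamps.
func entities(fromPayload payload: Data) throws -> [ToDoItemEntity] {
    let decoder = JSONDecoder()
    decoder.dateDecodingStrategy = .secondsSince1970
    let dtos = try decoder.decode([ToDoItemDTO].self, from: payload)
    return dtos.map { $0.makeEntity() }
}
```

Since the extension is the only code that mentions both types, you can move it into its own file and test the DTO’s decoding without the entity in the picture.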

Optimizations

You may look at this explanation, and the fact that there are at least three separate ways of representing each piece of data, and feel like it’s a bit much. Every time you add a new entity you need to add it in three places, plus all the places where it gets converted. That can add up to a lot of work.

That’s all true. Still, for the purest architecture, that’s what you should do. I recommend at least giving it a try and seeing the benefits you get from the clean separation.

On the other hand, there are ways to make it easier on yourself. As you saw in the first example, you can use one data structure to play more than one role – your Domain Model can double as your Data Transfer Object. If the shapes line up pretty closely this can work in practice.

Likewise, you can make your Database Entity do double duty as a Data Transfer Object. In one of my projects, I created a CodableManagedObject base class for my Core Data objects which automatically and dynamically adds Codable conformance to NSManagedObject. It works for that project because I store the objects in the database pretty much the same way they arrive over the REST interface.

One thing I don’t recommend is using the same object for Domain Model and Database Entity. You really want your domain objects to be immutable in your presentation code and business logic.

There is a lot more to be explored in this area of data representation, but this should give you a good start. The important thing is to take the time to think, “Where does this belong?” before implementing any logic in your app. Make sure you have a clear idea of your architecture’s separation of concerns.