How more time types prevent bugs and add clarity

Using a single type to represent dates, times, and datetimes with and without timezones is confusing for developers. It makes the code less clear and harder to understand.

Furthermore, functions that operate on these time representations cannot know which parts are meaningful data and which are fake, generated filler. This limits what those functions can do and raises the risk that their results are incorrect.

On the other hand, having more types that match real-life concepts makes it easier for programmers to both model and understand their programs. It also makes it easier for libraries to help programmers.

Types in databases

In common databases, there are some types for dates that make a good deal of sense. You have a time, e.g. 10:15:58. That is all the information the time has. You have a date, e.g. 2016-01-17. And then you have datetimes, which are… surprise: a date and a time. E.g. 2016-01-17 10:15:58.

So far so good. Then there is the JavaScript way of handling time and date. In JavaScript there is just one type. It is called Date. Now you might think that a JavaScript Date would represent, well, a date. But actually in JavaScript a Date is a datetime. And it throws in some timezone information for good measure. So a Date is not a date. Think about that for a second.

But the confusing naming is not the biggest problem. The problem is that programmers are encouraged to use this one-size-fits-all type to store not just datetimes with timezones, but also datetimes without timezones, dates without a time or timezone, and times without a date or timezone.

Understanding the data in a program

When a programmer reads code, it should not be unnecessarily hard to understand what is going on.

Imagine that you want to keep track of the date when people were born. You get the date as a string in a year-month-day format. This data will be saved in a database. You use JavaScript’s Date.

new Date("2016-1-17")
Sun Jan 17 2016 00:00:00 GMT+0100 (CET)

This gets you not just a Date, but a certain millisecond in time. Midnight at that date. And it adds a timezone too.
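To make the "certain millisecond" concrete, here is a minimal sketch. It builds the Date with Date.UTC instead of a local-time string, purely so that the numbers are the same on any machine; the variable names are just for illustration.

```javascript
// A JavaScript Date wraps a single number: milliseconds since 1970-01-01T00:00:00Z.
const midnightUtc = new Date(Date.UTC(2016, 0, 17)); // months are zero-based, so 0 is January

const millis = midnightUtc.getTime();
console.log(millis); // 1452988800000

// Only year, month and day were supplied, yet time-of-day fields exist and are all zero.
const timeOfDay = [
  midnightUtc.getUTCHours(),
  midnightUtc.getUTCMinutes(),
  midnightUtc.getUTCSeconds(),
  midnightUtc.getUTCMilliseconds(),
];
console.log(timeOfDay); // [ 0, 0, 0, 0 ]
```

Nothing in the value itself records whether those zeros were intended or are just filler.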

But what if the intention was to store just the date, and not the time of day (00:00:00.000)? What if you store that information in the database? How do you know if that is supposed to represent midnight at that date, or just the date? What about the timezone offset? Was that intended by the programmer or is it just something added by JavaScript?

Do you manually look at a sample of the data, see that the ones you looked at were all at midnight, and then decide that the time part is probably just something that was added automatically by the JavaScript Date functionality?

The programmer who comes back to read their own code or someone else’s code does not know if the data is supposed to be a datetime within a certain timezone, or if any of these parts were just added as fake “filler” by JavaScript’s Date: the date, the time of day, the timezone.

Looking at the type and the data, it looks like it is a datetime, even though the programmer just needs to represent a simple date.

The libraries’ understanding of what it is supposed to mean

The programmer is not the only one who is limited by the one-size-fits-all Date(time). Imagine having to represent the date as a string. You might use a library that can format the JavaScript “Date”. Since the Date has both a datetime and timezone, it could print an RFC3339 string like so: 2016-01-16T23:00:00Z

In this case the function prints the datetime in RFC3339 as UTC. But hold on, the date is now the 16th of January instead of the 17th! All we originally wanted to store was the 17th of January 2016, but the type also includes time and a timezone. Because of the timezone, that same time in UTC is the day before at 11pm.
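The shift can be reproduced with a few lines of JavaScript. This sketch writes the UTC+01:00 offset into the input string explicitly (an assumption standing in for a machine set to CET), so it behaves the same regardless of the local timezone. Note that the built-in toISOString also prints milliseconds.

```javascript
// Midnight on 17 January 2016 in a UTC+01:00 zone, with the offset spelled out.
const birthday = new Date("2016-01-17T00:00:00+01:00");

// toISOString always renders the instant in UTC, so the calendar day moves back by one.
const asUtc = birthday.toISOString();
console.log(asUtc); // "2016-01-16T23:00:00.000Z"

// Taking "the date part" of that string now yields the wrong day.
const utcDay = asUtc.slice(0, 10);
console.log(utcDay); // "2016-01-16"
```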

If we had a type available representing a simple date (year, month, day), and had used that, the library could be designed to refuse to print an RFC3339 string on the basis that it needs to know the time and timezone.

By using a type that includes extra, fake, filler data (the time of day and timezone), we open ourselves up to bugs. Like with security, where there are attack surfaces, the extra data becomes a “bug surface”.
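As a rough sketch of what such a date-only type could look like in JavaScript: PlainDate and formatRfc3339 below are hypothetical names made up for this illustration, not part of any real library. The formatter only accepts values that actually carry a point in time and refuses everything else.

```javascript
// Hypothetical date-only type: it stores year, month and day and nothing else.
class PlainDate {
  constructor(year, month, day) {
    this.year = year;
    this.month = month; // 1-12, unlike the zero-based months of JavaScript's Date
    this.day = day;
  }

  // A date on its own can still be rendered as an ISO 8601 calendar date.
  toISODate() {
    const pad = (n, width) => String(n).padStart(width, "0");
    return `${pad(this.year, 4)}-${pad(this.month, 2)}-${pad(this.day, 2)}`;
  }
}

// Hypothetical formatter: RFC 3339 needs a date, a time of day and a UTC offset,
// so it only accepts values that really contain a point in time.
function formatRfc3339(value) {
  if (!(value instanceof Date)) {
    throw new TypeError("RFC 3339 needs a time of day and a UTC offset");
  }
  return value.toISOString();
}

const born = new PlainDate(2016, 1, 17);
console.log(born.toISODate()); // "2016-01-17"

// Asking for an RFC 3339 string from a plain date fails loudly instead of guessing.
let refusal = null;
try {
  formatRfc3339(born);
} catch (err) {
  refusal = err;
}
console.log(refusal instanceof TypeError); // true
```

The date stays the 17th because there is no time or timezone around to shift it.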

Making the intention clear

The types are like a vocabulary. When programmers are limited to just one word for date, time, datetime with timezone and datetime without timezone, it is harder for them to communicate clearly.

Alternatives

In Java (not JavaScript), the built-in type was also called Date. But for a lot of people, that built-in library was not suitable. An alternative library was made: Joda Time. This became the de facto standard library for date and time in Java.

Joda Time has a separate type to store just a date without time, a separate type for just a time without a date, and so on. Here are some of the key types: LocalDate, LocalTime, Instant, DateTime, DateTimeZone

How Elixir Calendar handles it

Another example is the Calendar library for Elixir. Like Joda Time it does not have the one-size-fits-all Date type that JavaScript does. Instead it has separate types: a DateTime for a specific point in time in a specific timezone, a NaiveDateTime for a date and a time without a timezone, a Time for just a time (e.g. 15:25:16), and finally a Date for just a date (e.g. 2016-01-20).

In Calendar there is a function to format a datetime into an RFC 3339 string. The RFC 3339 string needs to contain a datetime and a timezone offset. A unix timestamp contains all of that. So we can parse a unix timestamp number, and then pass the datetime on to a function that formats that.

"1453303516" |> DateTime.Parse.unix! |> DateTime.Format.rfc3339
"2016-01-20T15:25:16Z"

If you have a date you can parse that as a Date:

"2016-01-16" |> Calendar.Date.Parse.iso8601!
%Calendar.Date{day: 16, month: 1, year: 2016}

That simply gets you a date. What if we try to format that as an RFC3339 string?

"2016-01-16" |> Calendar.Date.Parse.iso8601! |> DateTime.Format.rfc3339
** (Protocol.UndefinedError) protocol Calendar.ContainsDateTime not implemented for %Calendar.Date{day: 16, month: 1, year: 2016}
[...]

We get an error, and that is great! The DateTime.Format.rfc3339 function does not have enough information to generate an RFC3339 string so it raises an error. To generate an RFC3339 string you need to know the date, time and UTC offset. In this case we only had a date, so the time and UTC offset were missing.

Ecto datetimes

In Ecto, an Elixir database wrapper, there are types for dates, times and datetimes. The datetime type does not contain any data telling anything about which timezone the datetime is in. (This is what is called a “naive” datetime in Calendar.)

Ecto also has an automatic timestamps feature that adds timestamps in UTC when a record is updated or inserted. The type used for inserted_at is the same as for any other datetime field though. This means that for any given datetime we do not know which timezone it belongs to.

So even though the programmer can read documentation and find out that inserted_at and updated_at timestamps are UTC, the datetime returned does not contain this information.

There is a function for string representations of Ecto.DateTime structs as ISO 8601. ISO 8601 datetimes allow describing the timezone. Ideally, for the timestamps inserted_at and updated_at, there would be a function that formats them as ISO 8601 strings and adds a Z to convey that the datetimes are in UTC. But because the type also represents other datetimes where the timezone is unknown, we cannot know which datetimes are inserted_at timestamps in UTC and which are not.

So the only reasonable default is to opt for not saying anything about the timezone.

Calecto

Calecto is a Calendar-Ecto adapter. It knows that inserted_at/updated_at timestamps set in Ecto are in UTC. At the same time Calecto can make use of the various types in Calendar. So when using Calecto the inserted_at field will be a Calendar.DateTime struct that contains the information that the timezone is UTC. This means that we can use a formatting function to get an RFC 3339 string (RFC 3339 is a profile of ISO 8601) that contains the Z telling the world that this datetime is in UTC.

post.inserted_at |> DateTime.Format.rfc3339
"2016-01-20T15:25:16Z"

On the other hand if we use a normal Ecto.DateTime type or a Calecto.NaiveDateTime type for our field, we are saying that we do not know which timezone that datetime is in. Therefore if we tried to use the same formatting function as above, there would be an error message saying that we cannot create an RFC3339 string without knowing the timezone.

There is a way to overcome that though. If for some reason you have a datetime without timezone information and know that it is in UTC, you can promote a naive datetime to a DateTime:

post.some_naive_datetime |> NaiveDateTime.to_date_time_utc |> DateTime.Format.rfc3339
"2016-01-20T15:25:16Z"

The point is that for Calendar to generate a string that says something about the timezone, it has to get that information from somewhere. It does not “magically” pull a timezone assumption out of thin air. The information can come implicitly from the data: for example, UNIX timestamps are always in UTC. Or it can be an explicit part of the data: RFC 3339 timestamps always contain timezone offset information. If the timezone information is not contained in the input data, programmers have to explicitly tell the library about the timezone. An example of that is the NaiveDateTime.to_date_time_utc call. This is explicit and not something hidden that accidentally happens.
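The same principle can be sketched in JavaScript. Everything below (NaiveDateTime, UtcDateTime, toUtcDateTime, fromUnix, formatUtcRfc3339) is hypothetical and made up for this illustration. It is not how Calendar is implemented, just a small model of "the timezone has to come from somewhere".

```javascript
// Hypothetical naive datetime: wall-clock fields only, no timezone or offset.
class NaiveDateTime {
  constructor(year, month, day, hour, minute, second) {
    Object.assign(this, { year, month, day, hour, minute, second });
  }

  // Explicit promotion: the caller asserts that these fields are in UTC.
  toUtcDateTime() {
    return new UtcDateTime(this);
  }
}

// Hypothetical datetime that is known to be in UTC.
class UtcDateTime {
  constructor(naive) {
    this.naive = naive;
  }

  // Unix timestamps are defined relative to UTC, so the data itself supplies the timezone.
  static fromUnix(seconds) {
    const d = new Date(seconds * 1000);
    return new UtcDateTime(new NaiveDateTime(
      d.getUTCFullYear(), d.getUTCMonth() + 1, d.getUTCDate(),
      d.getUTCHours(), d.getUTCMinutes(), d.getUTCSeconds()
    ));
  }
}

// Only values with a known offset can be formatted; naive values are rejected.
function formatUtcRfc3339(value) {
  if (!(value instanceof UtcDateTime)) {
    throw new TypeError("cannot format as RFC 3339 without a known UTC offset");
  }
  const pad = (n, width = 2) => String(n).padStart(width, "0");
  const { year, month, day, hour, minute, second } = value.naive;
  return `${pad(year, 4)}-${pad(month)}-${pad(day)}T${pad(hour)}:${pad(minute)}:${pad(second)}Z`;
}

const naive = new NaiveDateTime(2016, 1, 20, 15, 25, 16);

const fromPromotion = formatUtcRfc3339(naive.toUtcDateTime());
console.log(fromPromotion); // "2016-01-20T15:25:16Z"

const fromTimestamp = formatUtcRfc3339(UtcDateTime.fromUnix(1453303516));
console.log(fromTimestamp); // "2016-01-20T15:25:16Z"
```

Passing the naive value straight to formatUtcRfc3339 throws a TypeError, so the only way to get the Z is through a step where the timezone is stated or derived from the data.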

Elixir and other languages

For people choosing a datetime library or considering building one, this information is useful in any language. If you ever build a datetime library, look at how it is done in JavaScript to see what NOT to do.

If you are using Elixir, the only library that implements the principles in this article is the Calendar library. That is the only datetime library I can recommend for Elixir.

If you liked this post you might want to follow me on twitter for updates on new posts and more. Twitter handle: @laut