Using a single type to represent dates, times, and datetimes with and without timezones can be confusing for developers. It makes the code less clear and harder to understand.
Furthermore, functions that operate on these time representations cannot know which information is fake, generated data and which is meaningful data. This means that their functionality can be limited and that they have a higher risk of being incorrect.
On the other hand, having more types that match real-life concepts makes it easier for programmers to both model and understand their programs. At the same time, it makes it easier for libraries to help programmers.
Types in databases
In common databases, there are some types for dates that make a good deal of sense.
You have a time, e.g. 10:15:58. That is all the information the time has. You have a date, e.g. 2016-01-17. And then you have datetimes, which are… surprise: a date and a time.
In JavaScript, the built-in Date type holds a date, a time of day, and timezone information for good measure. So a Date is not a date. Think about that for a second.
But the confusing naming is not the biggest problem. The problem is that programmers are encouraged to use this one-size-fits-all type to store not just datetimes with timezones, but also datetimes without timezones, dates without a time or timezone, and times without a date or timezone.
Understanding the data in a program
When a programmer reads code, it should not be unnecessarily hard to understand what is going on.
Creating a JavaScript Date for a given day gets you not just a date, but a certain millisecond in time: midnight on that date. And it adds a timezone too.
The programmer who comes back to read their own code, or someone else's code, does not know whether the data is supposed to be a datetime within a certain timezone, or whether some of its parts (the date, the time of day, the timezone) were just added as fake “filler” by JavaScript’s Date.
Looking at the type and the data, it looks like it is a datetime, even though the programmer just needs to represent a simple date.
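The following JavaScript sketch shows what the built-in Date ends up holding when only a calendar date was intended. It uses the 17th of January 2016 as the example date; the variable names are made up for illustration.

```javascript
// Intent: represent only the calendar date 17 January 2016.
// Note: the month argument is zero-based, so 0 means January.
const intended = new Date(2016, 0, 17);

// The value is really a point in time: milliseconds since the Unix epoch.
const millis = intended.getTime();

// A time of day was filled in even though none was given: local midnight.
const filledInTime = [
  intended.getHours(),
  intended.getMinutes(),
  intended.getSeconds(),
  intended.getMilliseconds(),
];

// The environment's timezone is involved too (offset from UTC in minutes).
const offsetMinutes = intended.getTimezoneOffset();

console.log(millis, filledInTime, offsetMinutes);
```

Nothing in the resulting value records that the time of day and the timezone were never part of the programmer's intent.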
The libraries’ understanding of what it is supposed to mean
The programmer is not the only one who is limited by the one-size-fits-all Date(time).
Imagine having to represent the date as a string. You might use a library that can print it as an RFC3339 string.
In this case the function prints the datetime in RFC3339 as UTC. But hold on, the date is now the 16th of January instead of the 17th! All we originally wanted to store was the 17th of January 2016, but the type also includes time and a timezone. Because of the timezone, that same time in UTC is the day before at 11pm.
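The day shift can be reproduced with JavaScript's built-in Date alone. This is a minimal sketch, assuming the date was created as midnight in a zone one hour ahead of UTC, which matches the 11 pm result described above. The offset is an assumption made for the illustration.

```javascript
// Assumption: the date was created as midnight in a zone one hour ahead of UTC.
const stored = new Date("2016-01-17T00:00:00+01:00");

// toISOString() always formats the point in time in UTC.
const asUtc = stored.toISOString();

console.log(asUtc); // 2016-01-16T23:00:00.000Z
```

The string is a correct rendering of the stored millisecond, yet the calendar date in it is no longer the one the programmer meant.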
If we had a type available representing a simple date (year, month, day), and had used that, the library could be designed to refuse to print an RFC3339 string on the basis that it needs to know the time and timezone.
By using a type that includes extra, fake, filler data (the time of day and timezone), we open ourselves up to bugs. As with security, where there are attack surfaces, the extra data becomes a “bug surface”.
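The idea of a library refusing to format a plain date can be sketched in JavaScript. The names PlainDate, ZonedDateTime, and formatRfc3339 are hypothetical, made up for this illustration, and do not belong to any real library. Months are one-based here, unlike in the built-in Date.

```javascript
// Hypothetical type: a calendar date and nothing else.
class PlainDate {
  constructor(year, month, day) {
    Object.assign(this, { year, month, day });
  }
}

// Hypothetical type: date, time of day, and a UTC offset in minutes.
class ZonedDateTime {
  constructor(year, month, day, hour, minute, second, offsetMinutes) {
    Object.assign(this, { year, month, day, hour, minute, second, offsetMinutes });
  }
}

const pad = (n) => String(n).padStart(2, "0");

// Formats only values that really carry a date, a time, and an offset.
function formatRfc3339(value) {
  if (!(value instanceof ZonedDateTime)) {
    throw new TypeError("RFC 3339 needs a date, a time, and a UTC offset");
  }
  const { year, month, day, hour, minute, second, offsetMinutes } = value;
  const sign = offsetMinutes < 0 ? "-" : "+";
  const abs = Math.abs(offsetMinutes);
  const offset =
    offsetMinutes === 0 ? "Z" : `${sign}${pad(Math.floor(abs / 60))}:${pad(abs % 60)}`;
  return `${year}-${pad(month)}-${pad(day)}T${pad(hour)}:${pad(minute)}:${pad(second)}${offset}`;
}

console.log(formatRfc3339(new ZonedDateTime(2016, 1, 17, 0, 0, 0, 60)));
// 2016-01-17T00:00:00+01:00

try {
  formatRfc3339(new PlainDate(2016, 1, 17));
} catch (err) {
  console.log(err.message); // RFC 3339 needs a date, a time, and a UTC offset
}
```

Because the plain date carries no filler, the formatter has nothing to misinterpret and can fail loudly instead of printing a shifted day.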
Making the intention clear
Types are like a vocabulary. Limiting programmers to just one word for date, time, datetime with timezone, and datetime without timezone makes it harder to communicate clearly.
In Java, the built-in date library was not suitable for a lot of people, so an alternative library was made: Joda Time. This became the de facto standard library for date and time in Java. Joda Time has a separate type to store just a date without a time, a separate type for just a time without a date, and so on. Here are some of the key types:
LocalDate, LocalTime, Instant, DateTime, DateTimeZone
How Elixir Calendar handles it
Another example is the Calendar library for Elixir. Like Joda Time, it does not have the one-size-fits-all Date type. Instead it has separate types: DateTime for a specific point in time in a specific timezone, NaiveDateTime for just a date and a time without a timezone, Time for just a time (e.g. 15:25:16), and finally Date for just a date (e.g. 2016-01-20).
In Calendar there is a function to format a datetime into an RFC 3339 string. The RFC 3339 string needs to contain a datetime and a timezone offset. A Unix timestamp contains all of that. So we can parse a Unix timestamp number, and then pass the resulting datetime on to a function that formats it.
If you have a date you can parse that as a Date:
That simply gets you a date. What if we try to format that as an RFC3339 string?
We get an error, and that is great! The DateTime.Format.rfc3339 function does not have enough information to generate an RFC3339 string so it raises an error. To generate an RFC3339 string you need to know the date, time and UTC offset. In this case we only had a date, so the time and UTC offset were missing.
In Ecto, an Elixir database wrapper, there are types for dates, times and datetimes. The datetime type does not contain any data telling anything about which timezone the datetime is in. (This is what is called a “naive” datetime in Calendar.)
Ecto also has an automatic timestamps feature that adds timestamps in UTC when a record is updated or inserted. The type used for inserted_at is the same as for any other datetime field, though. This means that for any given datetime we do not know which timezone it belongs to.
So even though the programmer can read documentation and find out that inserted_at and updated_at timestamps are UTC, the date returned does not contain this information.
There is a function that represents Ecto.DateTime structs as ISO 8601 strings. ISO 8601 datetimes allow describing the timezone. Ideally, for the timestamps inserted_at and updated_at, there would be a function that formats them as ISO 8601 strings and adds a Z to convey that the datetimes are in UTC. But because the type also represents other datetimes where the timezone is unknown, we cannot know which datetimes are timestamps in UTC and which are not.
So the only reasonable default is to opt for not saying anything about the timezone.
Calecto is a Calendar-Ecto adapter. It knows that inserted_at/updated_at timestamps set in Ecto are in UTC.
At the same time, Calecto can make use of the various types in Calendar. So when using Calecto, the inserted_at field will be a Calendar.DateTime struct that contains the information that the timezone is UTC. This means that we can use a formatting function to get an RFC3339 string (RFC3339 is an ISO 8601 profile) that contains the Z telling the world that this datetime is in UTC.
On the other hand if we use a normal Ecto.DateTime type or a Calecto.NaiveDateTime type for our field, we are saying that we do not know which timezone that datetime is in. Therefore if we tried to use the same formatting function as above, there would be an error message saying that we cannot create an RFC3339 string without knowing the timezone.
There is a way to overcome that though. If for some reason you have a datetime without timezone information and know that it is in UTC, you can promote a naive datetime to a DateTime:
The point is that for Calendar to generate a string that says something about the timezone, it has to get that information from somewhere. It does not “magically” pull a timezone assumption out of thin air. The information can be implicit in the data: for example, UNIX timestamps are always in UTC. Or it can be an explicit part of the data: RFC 3339 timestamps always contain timezone offset information.
If the timezone information is not contained in the input data, programmers have to explicitly tell the library about the timezone.
An example of that is the NaiveDateTime.to_date_time_utc call. This is explicit and not something hidden that accidentally happens.
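The same principle can be sketched outside Elixir. The JavaScript below is only an illustration and not Calendar's API: the names NaiveDateTime, UtcDateTime, assumeUtc, and formatIso8601 are hypothetical. It shows a formatter that says nothing about the timezone unless the type carries that information, and a promotion step that the programmer has to call explicitly.

```javascript
// Hypothetical type: a date and a time with no timezone information at all.
class NaiveDateTime {
  constructor(year, month, day, hour, minute, second) {
    Object.assign(this, { year, month, day, hour, minute, second });
  }
}

// Hypothetical type: wraps a naive datetime that is known to be in UTC.
class UtcDateTime {
  constructor(naive) {
    this.naive = naive;
  }
}

// Explicit promotion: the caller asserts that the naive value is in UTC.
function assumeUtc(naive) {
  return new UtcDateTime(naive);
}

const two = (n) => String(n).padStart(2, "0");

function formatIso8601(value) {
  const isUtc = value instanceof UtcDateTime;
  const n = isUtc ? value.naive : value;
  const base =
    `${n.year}-${two(n.month)}-${two(n.day)}` +
    `T${two(n.hour)}:${two(n.minute)}:${two(n.second)}`;
  // Only claim UTC ("Z") when the type says so; otherwise say nothing.
  return isUtc ? base + "Z" : base;
}

const fromDatabase = new NaiveDateTime(2016, 1, 20, 15, 25, 16);
console.log(formatIso8601(fromDatabase));            // 2016-01-20T15:25:16
console.log(formatIso8601(assumeUtc(fromDatabase))); // 2016-01-20T15:25:16Z
```

The Z only appears after assumeUtc has been called, so the claim about the timezone is visible at the call site rather than hidden inside the formatter.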
Elixir and other languages
If you are using Elixir, the only library that implements the principles in this article is the Calendar library. That is the only datetime library I can recommend for Elixir.