[Originally posted on Medium here. It’s not clear to me whether to keep running with this platform or to use something more convenient like Medium or Ghost.]

Quite literally me thinking out loud about identifiers and authenticators while also trying out this medium.com blogging thing. Feedback is welcome. I am most certainly still formulating my thoughts on some of this content, particularly around management of internal and external identifiers.

This article assumes a certain degree of familiarity with the underlying material. Technical terms may or may not be defined. Much of this discussion assumes your use case requires you to uniquely identify a Subject.

Identifiers vs. Authenticators

In the process of authenticating and authorizing a Subject, we first need to identify them. Traditionally this is done by having them provide an identifier like a self-selected username, an assigned identifier, or an email address. In this model, the Subject self-asserts their identity during the authentication process.

Since the Subject is self-asserting their identity, we usually want a way to establish trust that they are who they claim to be. This is traditionally accomplished with a password, a shared secret between the Subject and the Identity Provider. If the Subject knows the shared secret for the claimed identity, we assume they are the actual Subject who owns that identity.

Aside: In a properly designed system the Identity Provider does not store the secret itself. Instead it stores a salted hash of the password, produced by a function designed for password storage.
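To make the aside concrete, here is a minimal sketch of storing and checking a password without keeping the password itself. It uses PBKDF2 from Python's standard library; the function names, the iteration count, and the salt length are my own assumptions for illustration, not a recommendation for any particular Identity Provider.

```python
import hashlib
import hmac
import os

# Assumed values for illustration only; tune these for your own environment.
ITERATIONS = 600_000
SALT_BYTES = 16


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key); these are stored instead of the password."""
    salt = os.urandom(SALT_BYTES)  # fresh random salt per Subject
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, stored_key: bytes) -> bool:
    """Re-derive the key from the claimed password and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored_key)
```

The Identity Provider keeps only the salt and the derived key. At login it repeats the derivation with the password the Subject presents and compares the results.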

Identifiers as Authenticators?

This is possibly contrary thinking and feedback is appreciated.

The division between “identifier” and “authenticator” is not always clear. Imagine a secret shared only between two Subjects. If the secret is known only to them, then the secret itself serves to both identify and authenticate the Subjects to each other.

Likewise biometric data serves both purposes, though this might be taken as especially controversial. Your fingerprint, under the assumption that it is unique, both identifies you and can be used to authenticate you. Despite potential concerns with relying on a single factor as both an identifier and an authenticator, this is how we use biometrics today in many use cases.

You might argue that this use case is identification only, not authentication, and that in some use cases identification is sufficient, but I find this to be splitting hairs. I can unlock my laptop with my password or my fingerprint. The result is the same and in my opinion this is both identification and authentication.

External Identifiers vs. Internal Identifiers

It was Ian Glazer’s talk “The Most Forgotten Thing In Identity Management” and article “Identifiers and Usernames” that got me thinking in depth about internal vs external identifiers, how we use them, what problems they solve, and what challenges they introduce. In general, decoupling external and internal identifiers makes our lives in IT much easier and provides a better user experience.

External Identifiers

External identifiers are the identifiers referred to above, the self-selected or assigned “username” being the canonical example. Email address is another commonly used external identifier. External identifiers provide Subjects a relatively easy, memorable, though certainly not infallible way of identifying themselves to a service.

It is important to note that external identifiers may change over time and your service should account for this. The Subject may wish to use a new email address, or they may wish to choose a new username. Providing an optimal user experience means supporting mutable external identifiers.

Consider Sony’s PlayStation Network (PSN) IDs which, until only relatively recently, could not be changed. I don’t have data for this, but I imagine it was pretty common for PlayStation users, no longer happy with the PSN ID they chose when they were 14 years old, to want to change it. I don’t believe the details are public, but the lack of support for changing your PSN ID may have been the result of embedding the PSN ID, an external identifier, in various back-end systems, making it difficult to change and requiring manual intervention and likely direct editing of database records.

Internal Identifiers

Internal identifiers are used by your digital systems to uniquely identify a digital identity. A classic example is the Unix uid which provides a unique numerical identifier for each digital identity on the system. A modern example would be issuing an ISO / IETF standard UUID to every digital identity. This is an identifier that is not (necessarily) known or visible to your users. The internal identifier is immutable, allowing you to support mutable external identifiers without disrupting the user experience.
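A small sketch of what this decoupling might look like. The `IdentityStore` class and its methods are hypothetical and in-memory only; the point is that everything else in the system is keyed on the UUID, so a rename touches only the username lookup.

```python
import uuid


class IdentityStore:
    """Hypothetical store: immutable internal IDs, mutable usernames."""

    def __init__(self) -> None:
        self._by_username: dict[str, uuid.UUID] = {}
        self._records: dict[uuid.UUID, dict] = {}

    def register(self, username: str) -> uuid.UUID:
        if username in self._by_username:
            raise ValueError("username already in use")
        internal_id = uuid.uuid4()  # random UUID, never shown to the user
        self._by_username[username] = internal_id
        self._records[internal_id] = {"username": username}
        return internal_id

    def rename(self, old: str, new: str) -> None:
        """Change the external identifier; the internal ID is untouched."""
        if new in self._by_username:
            raise ValueError("username already in use")
        internal_id = self._by_username.pop(old)
        self._by_username[new] = internal_id
        self._records[internal_id]["username"] = new

    def resolve(self, username: str) -> uuid.UUID:
        return self._by_username[username]
```

For example, a Subject who registered as "gamer2004" and later renames to "alex" resolves to the same UUID before and after, so any data keyed on that UUID follows them.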

Uniqueness

It is worth noting that the Unix uid allows a 1-uid-to-many-identifiers model. While each uid is unique, it can be mapped to more than one username. Each such username is simply an alias for the same identity identified by the given uid. I’m not sure if this is universal to all flavours of Unix and Linux but it was certainly true for the systems I have managed.
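To illustrate the aliasing, the sketch below groups usernames by uid from passwd-style text. The sample entries are made up for the example (the root/toor pairing mirrors a convention found on some BSD systems), and `usernames_by_uid` is a hypothetical helper, not a system call.

```python
from collections import defaultdict

# Made-up sample data in passwd(5) format: name:password:uid:gid:gecos:home:shell
SAMPLE_PASSWD = """\
root:x:0:0:root:/root:/bin/sh
toor:x:0:0:alternate root:/root:/bin/bash
alice:x:1000:1000:Alice:/home/alice:/bin/bash
"""


def usernames_by_uid(passwd_text: str) -> dict[int, list[str]]:
    """Group usernames by the numeric uid they map to."""
    groups: dict[int, list[str]] = defaultdict(list)
    for line in passwd_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(":")
        groups[int(fields[2])].append(fields[0])
    return dict(groups)
```

With the sample data, uid 0 maps to two usernames and uid 1000 maps to one. Any uid with more than one name is an alias group in the sense described above.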

It is not clear to me whether this 1-to-many model is a good idea. Should there always be a 1-to-1 relationship between internal and external identifiers or are there use cases that benefit from a 1-to-many relationship? What about many-to-1?

Delete Me?

This is an area where I have not completely worked through the details. It seems to me that there are a number of reasons you should never delete an internal identifier, and maybe external identifiers too:

  • You may need to maintain records for a Subject even if they are no longer associated with your service (regulatory requirements?). You may need to maintain a golden record of some sort.
  • You want to avoid internal identifier reuse. This is far more obvious and does not require you to maintain other data about the Subject.

Some complications:

  • Right to be Forgotten: what does this mean for internal identifiers?
  • A return user: if a Subject returns to your service, will you link them back to their old internal identifier, in which case, how do you support that process? Even if you give them a new internal identifier, what if your use case requires linking them to their old data?
  • Likewise previously used usernames or email addresses. I believe it is the case that you could allow both of these to be reused if you are confident they won’t link to old data, but there are likely other considerations.
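One possible reading of the points above, sketched under assumptions I have not fully validated: keep a "tombstone" for every internal identifier ever issued so it can never be reused, while releasing the external identifier and dropping other data about the Subject. The `Registry` and `Identity` names are hypothetical, and whether a bare tombstone satisfies a Right to be Forgotten request is exactly the open question.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Identity:
    internal_id: uuid.UUID
    username: Optional[str]  # None once the Subject has left
    active: bool = True


class Registry:
    """Hypothetical registry that never forgets an issued internal ID."""

    def __init__(self) -> None:
        self._identities: dict[uuid.UUID, Identity] = {}
        self._usernames: dict[str, uuid.UUID] = {}

    def create(self, username: str) -> uuid.UUID:
        if username in self._usernames:
            raise ValueError("username already in use")
        internal_id = uuid.uuid4()
        while internal_id in self._identities:  # never hand out an ID issued before
            internal_id = uuid.uuid4()
        self._identities[internal_id] = Identity(internal_id, username)
        self._usernames[username] = internal_id
        return internal_id

    def deactivate(self, internal_id: uuid.UUID) -> None:
        """Release the username but keep the internal ID as a tombstone."""
        identity = self._identities[internal_id]
        if identity.username is not None:
            del self._usernames[identity.username]
        identity.username = None
        identity.active = False

    def is_reserved(self, internal_id: uuid.UUID) -> bool:
        return internal_id in self._identities
```

In this sketch a returning Subject who picks the same username gets a brand new internal identifier. Linking them back to their old one would need a separate, deliberate process, which is the "return user" complication.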

Authenticators

I don’t have much to offer here. I believe the following are all well established:

  • Passwords are bad but unfortunately still necessary in many cases.
  • Enforced password complexity rules and password expiration are doubly bad.
  • Password managers are good where passwords are necessary, but the password manager user experience is not great.
  • Hardware authenticators, platform authenticators, and biometrics are all becoming more popular, and passwordless authentication is gaining ground as well.