Kotlin data classes and sensitive information

One of my favorite features in Kotlin is data classes. Pretty much every “Introduction to Kotlin” article begins with how they let you reduce the boilerplate in simple classes like this:

Java

public class Customer {
   private String name;
   private String email;
   private String password;
   private String company;

   public Customer(String name) {
       this(name, "", "", "");
   }

   public Customer(String name, String email) {
       this(name, email, "", "");
   }

   public Customer(String name, String email, String password) {
       this(name, email, password, "");
   }

   public Customer(String name, String email, String password, String company) {
       this.name = name;
       this.email = email;
       this.password = password;
       this.company = company;
   }

   public String getName() {
       return name;
   }

   public void setName(String name) {
       this.name = name;
   }

   public String getEmail() {
       return email;
   }

   public void setEmail(String email) {
       this.email = email;
   }

   public String getPassword() {
       return password;
   }

   public void setPassword(String password) {
       this.password = password;
   }

   public String getCompany() {
       return company;
   }

   public void setCompany(String company) {
       this.company = company;
   }

   @Override
   public boolean equals(Object o) {
       if (this == o) return true;
       if (o == null || getClass() != o.getClass()) return false;

       Customer customer = (Customer) o;

       if (name != null ? !name.equals(customer.name) : customer.name != null) return false;
       if (email != null ? !email.equals(customer.email) : customer.email != null) return false;
       if (password != null ? !password.equals(customer.password) : customer.password != null) return false;
       return company != null ? company.equals(customer.company) : customer.company == null;
   }

   @Override
   public int hashCode() {
       int result = name != null ? name.hashCode() : 0;
       result = 31 * result + (email != null ? email.hashCode() : 0);
       result = 31 * result + (password != null ? password.hashCode() : 0);
       result = 31 * result + (company != null ? company.hashCode() : 0);
       return result;
   }

   @Override
   public String toString() {
       return "Customer{" +
               "name='" + name + '\'' +
               ", email='" + email + '\'' +
               ", password='" + password + '\'' +
               ", company='" + company + '\'' +
               '}';
   }
}

Kotlin

data class Customer(var name: String, var email: String, var password: String, var company: String)

(Note: This is just an example; please don’t design your model classes like this.)

The nice thing about Kotlin’s Data Classes is that the equals(), hashCode() and toString() methods are automatically generated for you behind the scenes, so you don’t have to write or maintain them yourself. This is super helpful, but let’s see what happens when you print an object of this class:

val customer = Customer("Android", "android@google.com", "Hunter1", "Google Inc.")
print(customer)

Output:

Customer(name=Android, email=android@google.com, password=Hunter1, company=Google Inc.)

What’s happened here is that the toString() method that was automatically generated for us got invoked to pretty-print the values stored in the object, including the user’s password.

This is clearly a problem: anyone who has access to the application’s logs can read the user’s password. Now, if you turn off logging in release builds, the problem is solved, right? Well, not necessarily.

Let’s say you include your data object in the message of an Exception before throwing it. The crash reporting library that you’ve integrated in your app will most likely record that message, which contains the toString() output, and promptly upload it to its servers. The scariest thing about this is that you won’t notice it until it’s too late.
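To make this scenario concrete, here is a minimal sketch. The validate() helper and its error message are hypothetical, and no real crash reporting library is involved; the only point is that string interpolation calls the generated toString() for you.

```kotlin
// Same example class as above.
data class Customer(var name: String, var email: String, var password: String, var company: String)

// Hypothetical validation helper, for illustration only.
fun validate(customer: Customer) {
    // "$customer" implicitly calls customer.toString().
    require(customer.company.isNotBlank()) { "Invalid customer: $customer" }
}

fun main() {
    val customer = Customer("Android", "android@google.com", "Hunter1", "")
    try {
        validate(customer)
    } catch (e: IllegalArgumentException) {
        // A crash reporter would typically record this message as-is.
        println(e.message)
        // Invalid customer: Customer(name=Android, email=android@google.com, password=Hunter1, company=)
    }
}
```

Nothing in validate() mentions the password, which is what makes this kind of leak easy to miss in code review.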

Note that this is not a problem unique to Kotlin. It can also happen with any Java class whose toString() has been overridden without hiding the sensitive information. Kotlin’s data classes simply hide this detail, which makes it easy to shoot yourself in the foot.

Anyway, as with pretty much everything in programming, there are multiple ways to fix this:

Solution #1

The simplest way to solve this is to override the toString() method in your data classes like this:

data class Customer(var name: String, var email: String, var password: String, var company: String) {
    override fun toString() = "Customer(name=$name, email=$email, password=REDACTED, company=$company)"
}

Now if you print the object, the output will not contain the password.

Customer(name=Android, email=android@google.com, password=REDACTED, company=Google Inc.)

This works for simple classes, but imagine doing this for classes with lots of properties. We’d be back to the same problem of writing and maintaining utility methods by hand.

Solution #2

In Kotlin, only the properties declared in the primary constructor are included in the auto-generated methods. So, if you declare the data class like this, the password will not be included in any of the generated methods (equals(), hashCode(), toString(), copy() and the componentN() functions).

data class Customer(var name: String, var email: String, var company: String) {
    var password: String = ""
}

...
{
    val customer = Customer("Android", "android@google.com", "Google Inc.")
    customer.password = "Hunter1"
    print(customer)
}

Output:

Customer(name=Android, email=android@google.com, company=Google Inc.)

This is arguably better, but ugly. The password can no longer be passed through the constructor, and if a class has several sensitive properties, consumers have to remember which values go in the constructor and which ones must be set separately.
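There is a second cost worth seeing in code: because the password lives outside the primary constructor, equals(), hashCode() and copy() ignore it as well. A small sketch, reusing the class from this solution (the second password, "Hunter2", is made up for the comparison):

```kotlin
data class Customer(var name: String, var email: String, var company: String) {
    // Declared in the class body, so the generated methods ignore it.
    var password: String = ""
}

fun main() {
    val a = Customer("Android", "android@google.com", "Google Inc.").apply { password = "Hunter1" }
    val b = Customer("Android", "android@google.com", "Google Inc.").apply { password = "Hunter2" }

    println(a == b)                       // true: the password is not compared
    println(a.hashCode() == b.hashCode()) // true: the password is not hashed
    println(a.copy().password.isEmpty())  // true: copy() does not carry the password over
}
```

Whether that matters depends on how the objects are used, but two customers with different passwords comparing as equal can be surprising in sets, maps and tests.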

Solution #3

Instead of overriding the toString() method of every data class that holds sensitive information, we can do it once in a generic wrapper like this:

class ProtectedProperty<T>(var value: T) {
    override fun toString() = "REDACTED"
}

data class Customer(var name: String, var email: String, var password: ProtectedProperty<String>, var company: String)

Now, if you use it in the data class to hold sensitive values, it doesn’t leak them.

{
  val customer = Customer("Android", "android@google.com", ProtectedProperty("Hunter1"), "Google Inc.")
  print(customer)
}

Output:

Customer(name=Android, email=android@google.com, password=REDACTED, company=Google Inc.)

You can even make your logs look cooler by using the Unicode Full Block character (U+2588) instead.

class ProtectedProperty<T>(var value: T) {
    override fun toString() = "███████████"
}

Output:

Customer(name=Android, email=android@google.com, password=███████████, company=Google Inc.)

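I personally like this solution since it gives you the best of both worlds: it hides sensitive information easily and keeps the generated hashCode() and equals() on the data class. One thing to watch: ProtectedProperty as written is a plain class, so two wrappers holding the same value are compared by reference. For value-based comparison, give the wrapper its own equals() and hashCode(), for example by declaring it as a data class while keeping the toString() override.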
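As a rough sketch of a wrapper that compares by value, the variant below declares ProtectedProperty as a data class with toString() still overridden. This is an assumption of the example, not the declaration shown earlier, and the "Hunter2" password is made up for the comparison.

```kotlin
// Assumed variant: a data class, so equals() and hashCode() are generated from `value`.
// The explicit toString() takes precedence over the generated one.
data class ProtectedProperty<T>(var value: T) {
    override fun toString() = "REDACTED"
}

data class Customer(var name: String, var email: String, var password: ProtectedProperty<String>, var company: String)

fun main() {
    val a = Customer("Android", "android@google.com", ProtectedProperty("Hunter1"), "Google Inc.")
    val b = Customer("Android", "android@google.com", ProtectedProperty("Hunter1"), "Google Inc.")
    val c = a.copy(password = ProtectedProperty("Hunter2"))

    println(a)                // Customer(name=Android, email=android@google.com, password=REDACTED, company=Google Inc.)
    println(a == b)           // true: same wrapped value
    println(a == c)           // false: different wrapped value
    println(a.password.value) // Hunter1: the real value is still available when asked for explicitly
}
```

With a plain class instead of a data class, the second line would print false, because the two wrappers are distinct objects.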

Happy coding!