Some might argue that voter registration records do not contain anything particularly sensitive, like financial information or social security numbers. Typically, voter registration records contain your full name, date of birth, address, and maybe a phone number. The information that was in this database is available to political campaigns so that they can send you reasons that you should vote for their candidates. Political campaigns typically have to pay for this information though.
You might be wondering why this matters if this information is relatively easy to get a hold of and it does not contain anything particularly sensitive. That is an understandable opinion, and you likely came to that conclusion after weighing the breach against your tolerance for your information being available. For some, this breach is uncomfortable. As an information security professional, this breach concerns me as many other breaches do because I am troubled by failures of those entrusted with the data of others. With this breach in particular, I want to talk about three failures specifically:
- Availability of the information (this is not availability in the CIA triad sense). In this case, did the information need to be available on the Internet?
- Access controls for the data. It does not sound like there were access controls on the data (any sort of authentication) since it was "publicly available."
- The data was stored in plaintext. Since the data is sensitive, it should have been encrypted just in case other access controls failed (defense in depth, remember?). I am not saying encryption would have made this story a moot point. Rather, it would have made things harder (but not impossible) for anyone who accessed the data to do anything with it.
The Availability of Information

The Internet is an awesome thing. It allows people to share information, exchange ideas, and cross borders without leaving the house. However, some information and ideas do not need to be accessible to anyone on the Internet. It is not clear whose website was hosting the database, so it is hard to say for sure if the information needed to be posted on an Internet-accessible host. Sometimes, there is not much choice. Someone may have been told to share this database with a third-party, and the easiest thing for them to do was to put it on a machine that was accessible from the Internet. Of course, this is not the best option, but there are ways to make the information harder to access such as putting it behind a page that asks for credentials.
I have talked before about things that happen when there is an inadequate or non-existent security policy in an organization. This is another example of that. Whoever put this database in a place where anyone could access it either did not know that they should not do that or weighed the risks and decided to do it anyway. As I mentioned above, it is not clear why this information needed to be on the Internet in the first place. If some information inside of your organization needs to be accessible from the outside, you do not need to rely on security through obscurity (just because the URL for a resource is not advertised does not mean it cannot be found). Consider the following:
- You can set up a VPN to allow trusted users from outside of your network to access resources inside of it. Setting up a VPN correctly is not trivial, and perhaps you do not want to have other people install software. There are software packages that emulate a VPN over HTTPS so that you can access internal resources from a browser, but perhaps you do not want to set all of that up.
- You could use a third-party hosting provider (or you can do it yourself with something like ownCloud) if you are sending files. This might not work so well for a database depending on how you need to access the data, but for normal files this could be an easy solution. If you are going to use a third-party provider, I would suggest storing the files in an encrypted container or encrypted archive using a strong password. The password should be sent out-of-band (i.e. not the same way that you are sending the file).
- You could put the file(s) on a webpage, but encapsulate it / them in some sort of encrypted container with a strong password. The idea is similar to the last bullet point.
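Both of the last two options lean on a "strong password." One way to get one is to let the machine generate it rather than inventing it yourself. Here is a minimal sketch, assuming a Linux or OS X shell where /dev/urandom and the base64 utility are available (the variable name is just for illustration):

```shell
# Read 32 random bytes from the kernel's random source and base64-encode
# them so the result is printable.
archive_password=$(head -c 32 /dev/urandom | base64)

# Show the password once so it can be passed along out-of-band
# (a phone call, for example), not alongside the file itself.
echo "$archive_password"
```

Thirty-two random bytes is 256 bits of randomness, which is far more than a password a person would come up with on their own.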
Access Controls

Even if you hide the data inside of encrypted containers or VPNs, it should still have some sort of access control on it. With encrypted containers, if your password is not as strong as you think or becomes compromised, anyone with access to the file has access to the data. You can contain the breach if you can control who has access to the data. Sure, if an authorized user gives the data to an unauthorized user, you cannot control that. But, you might be able to establish a chain of custody. This requires accountability in addition to access controls, and the decision to implement that accountability depends on the value you place on the data.
Not everyone in your organization (or among your trusted third parties) needs, or should have, access to all of the information in your company. For example, Sue from Accounting needs access to the company's financial records, but Bill in Marketing may not. Access controls are one tool to enforce this.
In our example, even if the voter database needs to be publicly accessible for whatever reason, access controls make it harder for unauthorized users to access the data. If the database is accessed through a database server, the access controls can be set on the server. If the database is to be transferred as a dump, it will need to be hosted somewhere that takes care of authentication. If you will transfer it as a dump, then it is important to encrypt the file as most databases dump in plaintext. We will talk about encryption in the next section.
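Access controls also apply to the dump file itself while it sits on disk waiting to be encrypted or transferred. Here is a minimal sketch of the idea using file permissions, assuming a Linux system with GNU coreutils. The file name voters_dump.sql and its contents are placeholders, not the actual database:

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)

# umask 077 makes new files readable and writable by their owner only.
# The parentheses run this in a subshell so the umask change does not leak.
(
  umask 077
  printf -- '-- placeholder for a real database dump\n' > "$workdir/voters_dump.sql"
)

# Print the resulting mode; other users on the machine get no access.
dump_mode=$(stat -c '%a' "$workdir/voters_dump.sql")
echo "voters_dump.sql mode: $dump_mode"
```

This only limits who can read the file on that one machine. It does nothing for the file once it leaves the machine, which is where encryption comes in.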
Encryption

Encryption is an important tool to help prevent unauthorized users from seeing your data at rest or in transit. Many of the solutions we have talked about so far also use encryption to augment other protections like access controls. Depending on the data you want to encrypt, there are different solutions available. If you are going for protection against theft of a drive, you can employ full disk encryption solutions such as dm-crypt / LUKS on Linux and BitLocker on Windows. Keep in mind that full disk encryption will not protect you against malware that steals your files or protect the data in transit if you send it somewhere else. The data is available unencrypted to processes on the machine after you type in the password to unlock the drive. This means that if some malware decides to send your files to a server somewhere, the unencrypted versions will be sent.
If you want to protect the data in order to send it somewhere, then you need to encrypt the files themselves. There are a number of utilities available to do this. We will talk about three common ones: 7-zip, GnuPG, and ccrypt. For all of these examples, I am going to encrypt a simple text file to show you how they work. All of these utilities are available on Windows, OS X, and Linux.
Here is our reference file (secret_text.txt):
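The exact contents do not matter for what follows; any small text file will do. Here is a minimal sketch that creates a stand-in file in a scratch directory so you can follow along (the contents are made up for illustration):

```shell
# Create a scratch directory and a small stand-in file to experiment with.
workdir=$(mktemp -d)
printf 'This is a placeholder secret.\n' > "$workdir/secret_text.txt"

# Display the file and its size in bytes for later comparison.
cat "$workdir/secret_text.txt"
wc -c < "$workdir/secret_text.txt"
```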
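7-Zip

You might be wondering what a file archive utility is doing in a list of tools to encrypt files. According to its website, 7-zip supports 256-bit AES for 7z (7zip) and zip archives. Note that for zip archives, 7-zip defaults to the older and weaker ZipCrypto scheme unless you ask for AES explicitly with -mem=AES256, so the examples here stick with the 7z format. For reference, the U.S. government deems 192 and 256-bit AES good enough to protect classified information up to Top Secret.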
Let's encrypt our secret text file:
7z -mhe=on -pYouShouldProbablyMakeYourPasswordStrongerThanThis a protected.7z secret_text.txt
Since you never run a command you do not understand, let's break this down:
- 7z is the command line utility for 7-zip. You can use 7-zip File Manager if you want a GUI.
- -mhe=on tells 7-zip to encrypt the header of the archive. If you do not encrypt the header, the file names will be easy to extract. We will take a look at this in a bit. There might be sensitive information in the file name, so it is best to encrypt the header.
- -p<password> specifies the password to use to encrypt the archive
- a (note that there is no hyphen here) tells 7-zip to add the files to the archive we will specify next
- protected.7z is what I chose to name the archive. You can name it something else.
- secret_text.txt is the name of the file I want to protect.
You can see that the encryption made the file a bit larger, but it is worth it if the file needs protecting.
Let's see if we can recover the secret text from the archive:
It does not appear we can easily see the contents of the archive. What about the difference in the files when we encrypt the header and when we do not?
You can see in the file without header encryption, the file name is plainly visible where it is not in the file with header encryption. By default, 7-zip does not encrypt the header, so be sure to turn this option on.
To decrypt, the command does not change much:
7z -pYouShouldProbablyMakeYourPasswordStrongerThanThis e protected.7z
- We use -p to specify the same password we used before
- e tells 7-zip to extract from the archive
- protected.7z is the archive we want to extract from
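Putting the two commands together, here is a sketch of a full round trip that checks that the extracted file matches the original. It assumes the 7z binary is on your PATH (the example skips itself otherwise); the password, file contents, and the extracted directory name are placeholders. It uses x rather than e, together with the -o switch, so that the extracted copy lands in its own directory instead of on top of the original.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'This is a placeholder secret.\n' > secret_text.txt
demo_password='YouShouldProbablyMakeYourPasswordStrongerThanThis'

if command -v 7z >/dev/null 2>&1; then
  # Create the archive with an encrypted header, as in the command above.
  7z a -mhe=on -p"$demo_password" protected.7z secret_text.txt >/dev/null
  # Extract into ./extracted (note: no space between -o and the directory).
  7z x -p"$demo_password" -oextracted protected.7z >/dev/null
  # cmp exits 0 only when the two files are byte-for-byte identical.
  if cmp -s secret_text.txt extracted/secret_text.txt; then
    sevenzip_result=match
  else
    sevenzip_result=mismatch
  fi
else
  sevenzip_result=skipped
fi
echo "7z round trip: $sevenzip_result"
```

A password typed on the command line ends up in your shell history and is visible to other users in the process list. If you pass -p with nothing after it, 7-zip prompts for the password instead.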
GnuPG

GPG (GNU Privacy Guard) is an open source implementation of the OpenPGP standard, which grew out of PGP (Pretty Good Privacy). You hear about it more when encrypting e-mail or signing data. However, it can be used to encrypt files and supports a number of different ciphers. You can run gpg --version to see what ciphers are supported on your machine.
To encrypt a file, you can use the following command:
gpg -c --cipher-algo AES256 secret_text.txt
- -c or --symmetric tells GPG to use symmetric encryption (a pre-shared key) as opposed to public-key / asymmetric crypto. Since we want to supply a password, we will use this option.
- --cipher-algo tells GPG which cipher to use. The default for this version of gpg is AES128.
- secret_text.txt is the file we want to encrypt. Substitute whatever file you want to encrypt.
GPG will ask you to supply a password twice and will create a .gpg file. You can decrypt this file by issuing the following command:
The name of your gpg file will be different, but it will ask for the password. If it is correct, it will decrypt the file.
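Here is a sketch of the full encrypt-and-decrypt round trip. It assumes GnuPG 2.1 or later (the example skips itself if gpg is not installed). The --batch, --pinentry-mode loopback, and --passphrase options are there only so the example runs without prompting, and the passphrase, file contents, and file names are placeholders. At an interactive terminal, something like gpg -o recovered.txt -d secret_text.txt.gpg would prompt for the password instead.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'This is a placeholder secret.\n' > secret_text.txt
demo_passphrase='YouShouldProbablyMakeYourPasswordStrongerThanThis'

# Use a throwaway GnuPG home directory so the real keyring is not touched.
GNUPGHOME="$workdir/gnupg"
export GNUPGHOME
mkdir -m 700 "$GNUPGHOME"

if command -v gpg >/dev/null 2>&1; then
  # Symmetric encryption with AES256; writes secret_text.txt.gpg.
  gpg --batch --yes --quiet --pinentry-mode loopback \
      --passphrase "$demo_passphrase" \
      -c --cipher-algo AES256 -o secret_text.txt.gpg secret_text.txt
  # Decrypt to a separate file so the original is left alone.
  gpg --batch --yes --quiet --pinentry-mode loopback \
      --passphrase "$demo_passphrase" \
      -o recovered.txt -d secret_text.txt.gpg
  # cmp exits 0 only when the two files are byte-for-byte identical.
  if cmp -s secret_text.txt recovered.txt; then
    gpg_result=match
  else
    gpg_result=mismatch
  fi
  # Stop the background agent that gpg started for the throwaway home.
  gpgconf --kill gpg-agent 2>/dev/null || true
else
  gpg_result=skipped
fi
echo "gpg round trip: $gpg_result"
```

As with 7-zip, supplying the passphrase on the command line is fine for a demonstration but exposes it to other users on the machine, so let gpg prompt for it when the data matters.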
ccrypt

The final tool we will talk about for encryption is ccrypt. ccrypt uses AES 256. Its operation is similar to the other tools we have used:
ccrypt -e secret_text.txt
- -e tells ccrypt to encrypt the file we specify
- secret_text.txt is the file we want to encrypt
Note that ccrypt encrypts the file in place (meaning it overwrites the original, unencrypted file). So if you need an unencrypted copy of the file, be sure to copy it to another file before you encrypt it with ccrypt.
We cannot easily read this file either:
Decrypting is just as easy:
ccrypt -d secret_text.txt.cpt
- -d tells ccrypt to decrypt the specified file
- secret_text.txt.cpt is the encrypted file from the previous step
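Because ccrypt works in place, a round trip is easiest to verify against a copy made beforehand. Here is a sketch, assuming ccrypt is installed (the example skips itself otherwise). The -K option supplies the key on the command line purely so the example runs without prompting; the key, file contents, and the name original_copy.txt are placeholders.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'This is a placeholder secret.\n' > secret_text.txt
demo_key='YouShouldProbablyMakeYourPasswordStrongerThanThis'

# Keep an untouched copy, since ccrypt replaces the file it encrypts.
cp secret_text.txt original_copy.txt

if command -v ccrypt >/dev/null 2>&1; then
  # Encrypt in place: secret_text.txt becomes secret_text.txt.cpt.
  ccrypt -e -K "$demo_key" secret_text.txt
  # Decrypt in place: secret_text.txt.cpt becomes secret_text.txt again.
  ccrypt -d -K "$demo_key" secret_text.txt.cpt
  # cmp exits 0 only when the two files are byte-for-byte identical.
  if cmp -s secret_text.txt original_copy.txt; then
    ccrypt_result=match
  else
    ccrypt_result=mismatch
  fi
else
  ccrypt_result=skipped
fi
echo "ccrypt round trip: $ccrypt_result"
```

For real data, leave off -K and let ccrypt prompt for the key, since anything on the command line is visible to other users on the machine.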
A Quick Note on Encrypting Database Fields

Sometimes you need to encrypt parts (or all) of a database. If you encrypt the database in the ways we have talked about so far, the database software will not be able to read it, and that is not useful. Many databases such as MySQL allow you to encrypt fields in the database. The exact implementation of this will depend on the database software you are using. If there is interest, we can go into this in more detail in another post.
Final Thoughts

The title of this post mentions apathy. So far, we have talked about ways to protect data, but not much about apathy. As I mentioned in the beginning, I did not see a lot of articles about this data breach. It certainly did not reach the heights of some of the retail breaches over the past few years. This could have to do with the fact that voter registration data is not perceived as sensitive, but I think that is a fallacy. Depending on someone's intent, this data could be used for a number of bad things:
- Identity Theft: Full name, address, and birth date are most of the pieces that someone needs to impersonate you. For example, when you call your bank or credit card company, they might ask for your birth date as a way of verifying who you are.
- Stalking / Harassment: If someone has your name and the town in which you live, they could find your address.
- Phishing: The more someone has on you, the easier it is for them to build a phishing e-mail that you might want to click on.
- Password guessing: Lots of people use significant dates as part of their passwords. An easy date to remember is your birthday.
So is the perceived insensitivity of the data the reason for the lack of coverage, or have we become so used to data breaches that we do not bat an eye when a new one is reported? That kind of apathy is dangerous for all of us because it tells organizations that they will not be held responsible when the data we trust them to protect is breached.
What do you think? I would like to hear your thoughts on this one.
Thanks for reading!