ONLamp.com    
 Published on ONLamp.com (http://www.onlamp.com/)
 See this if you're having trouble printing code examples


FreeBSD Basics Cryptographic Terminology 101

by Dru Lavigne
10/31/2002

In the next few articles, I'd like to concentrate on securing data as it travels over a network. If you remember the IP packets series (see Capturing TCP Packets), most network traffic is transmitted in clear text and can be decoded by a packet sniffing utility. This can be bad for transmissions containing usernames, passwords, or other sensitive data. Fortunately, other utilities known as cryptosystems can protect your network traffic from prying eyes.

To configure a cryptosystem properly, you need a good understanding of the various terms and algorithms it uses. This article is a crash course in Cryptographic Terminology 101. Following articles will demonstrate configuring some of the cryptosytems that are available to FreeBSD.

What is a cryptosystem and why would you want to use one? A cryptosystem is a utility that uses a combination of algorithms to provide the following three components: privacy, integrity, and authenticity. Different cryptosytems use different algorithms, but all cryptosystems provide those three components. Each is important, so let's take a look at each individually.

Privacy

Privacy ensures that only the intended recipient understands the network transmission. Even if a packet sniffer captures the data, it won't be able to decode the contents of the message. The cryptosystem uses an encryption algorithm, or cipher, to encrypt the original clear text into cipher text before it is transmitted. The intended recipient uses a key to decrypt the cipher text back into the original clear text. This key is shared between the sender and the recipient, and it is used to both encrypt and decrypt the data. Obviously, to ensure the privacy of the data, it is crucial that only the intended recipient has the key, for anyone with the key can decrypt the data.

It is possible for someone without the key to decrypt the data by cracking or guessing the key that was used to encrypt the data. The strength of the encryption algorithm gives an indication of how difficult it is to crack the key. Normally, strengths are expressed in terms of bitsize. For example, it would take less time to crack a key created by an algorithm with a 56-bit size than it would for a key created by an algorithm with a 256-bit size.

Related Reading

Practical UNIX and Internet Security
By Simson Garfinkel, Gene Spafford, Alan Schwartz

Does this mean you should always choose the algorithm with the largest bit size? Not necessarily. Typically, as bit size increases, the longer it takes to encrypt and decrypt the data. In practical terms, this translates into more work for the CPU and slower network transmissions. Choose a bit size that is suited to the sensitivity of the data you are transmitting and the hardware you have. The increase in CPU power over the years has resulted in a double-edged sword. It has allowed the use of stronger encryption algorithms, but it has also reduced the time it takes to crack the key created by those algorithms. Because of this, you should change the key periodically, before it is cracked. Many cryptosystems automate this process for you.

There are some other considerations when choosing an encryption algorithm. Some encryption algorithms are patented and require licenses or restrict their usage. Some encryption algorithms have been successfully exploited or are easily cracked. Some algorithms are faster or slower than their bit size would indicate. For example, DES and 3DES are considered to be slow; Blowfish is considered to be very fast, despite its large bit size.

Legal considerations also vary from country to country. Some countries impose export restrictions. This means that it is okay to use the full strength of an encryption algorithm within the borders of the country, but there are restrictions for encrypting data that has a recipient outside of the country. The United States used to restrict the strength of any algorithm leaving the U.S. border to 40 bits, which is why some algorithms support the very short bit size of 40 bits.

There are still countries where it is illegal to even use encryption. If you are unsure if your particular country has any legal or export restrictions, do a bit of research before you configure your FreeBSD system to use encryption.

The following table compares the encryption algorithms you are most likely to come across.

AlgorithmBit SizePatentedComment
DES56 slow, easily cracked
3DES168 slow
Blowfish32 - 448noextremely fast
IDEA128yes 
CAST40 - 128yes 
Arcfour40, 128  
AES (Rijndael)128, 192, 256nofast
Twofish128, 256nofast

How much of the original packet is encrypted depends upon the encryption mode. If a cryptosystem uses transport mode, only the data portion of the packet is encrypted, leaving the original headers in clear text. This means that a packet sniffer won't be able to read the actual data but will be able to determine the IP addresses of the sender and recipient and which port number (or application) sent the data.

If a cryptosystem uses tunnel mode, the entire packet, data and headers, is encrypted. Since the packet still needs to be routed to its final destination, a new Layer 3 header is created. This is known as encapsulation, and it is quite possible that the new header contains totally different IP addresses than the original IP header. We will see why in a later article when we configure your FreeBSD system for IPSEC.

Integrity

Integrity is the second component found in cryptosystems. This component ensures that the data received is indeed the data that was sent and that the data wasn't tampered with during transit. It requires a different class of algorithms, known as cryptographic checksums or cryptographic hashes. You may already be familiar with checksums as they are used to ensure that all of the bits in a frame or a header arrived in the order they were sent. However, frame and header checksums use a very simple algorithm, meaning that it is mathematically possible to change the bits and still use the same checksum. Cryptographic checksums need to be more tamper-resistant.

Like encryption algorithms, cryptographic checksums vary in their effectiveness. The longer the checksum, the harder it is to change the data and recreate the same checksum. Also, some checksums have known flaws. The following table summarizes the cryptographic checksums:

Cryptographic
Checksum
Checksum lengthKnown flaws
MD4128yes
MD5128theoretical
SHA160theoretical
SHA-1160not yet

The order in the above chart is intentional. When it comes to cryptographic checksums, MD4 is the least secure, and SHA-1 is the most secure. Always choose the most secure checksum available in your cryptosystem.

Another term to look for in a cryptographic checksum is HMAC or Hash-based Message Authentication Code. This indicates that the checksum algorithm uses a key as part of the checksum. This is good, as it's impossible to alter the checksum without access to the key. If a cryptographic checksum uses HMAC, you'll see that term before the name of the checksum. For example, HMAC-MD4 is more secure than MD4, HMAC-SHA is more secure than SHA. If we were to order the checksum algorithms from least secure to most secure, it would look like this:

Authenticity

So far, we've ensured that the data has been encrypted and that the data hasn't been altered during transit. However, all of that work would be for naught if the data, and more importantly, the key, were mistakenly sent to the wrong recipient. This is where the third component, or authenticity, comes into play.

Before any encryption can occur, a key has to be created and exchanged. Since the same key is used to encrypt and to decrypt the data during the session, it is known as a symmetric or session key. How do we safely exchange that key in the first place? How can we be sure that we just exchanged that key with the intended recipient and no one else?

This requires yet another class of algorithms known as asymmetric or public key algorithms. These algorithms are called asymmetric as the sender and recipient do not share the same key. Instead, both the sender and the recipient separately generate a key pair which consists of two mathematically related keys. One key, known as the public key, is exchanged. This means that the recipient has a copy of the sender's public key and vice versa. The other key, known as the private key, must be kept private. The security depends upon the fact that no one else has a copy of a user's private key. If a user suspects that his private key has been compromised, he should immediately revoke that key pair and generate a new key pair.

When a key pair is generated, it is associated with a unique string of short nonsense words known as a fingerprint. The fingerprint is used to ensure that you are viewing the correct public key. (Remember, you never get to see anyone else's private key.) In order to verify a recipient, they first need to send you a copy of their public key. You then need to double-check the fingerprint with the other person to ensure you did indeed get their public key. This will make more sense in the next article when we generate a key pair and you see a fingerprint for yourself.

The most common key generation algorithm is RSA. You'll often see the term RSA associated with digital certificates or certificate authorities, also known as CAs. A digital certificate is a signed file that contains a recipient's public key, some information about the recipient, and an expiration date. The X.509 or PKCS #9 standard dictates the information found in a digital certificate. You can read the standard for yourself at http://www.rsasecurity.com/rsalabs/pkcs or http://ftp.isi.edu/in-notes/rfc2985.txt.

Digital certificates are usually stored on a computer known as a Certificate Authority. This means that you don't have to exchange public keys with a recipient manually. Instead, your system will query the CA when it needs a copy of a recipient's public key. This provides for a scalable authentication system. A CA can store the digital certificates of many recipients, and those recipients can be either users or computers.

It is also possible to generate digital certificates using an algorithm known as DSA. However, this algorithm is patented and is slower than RSA. Here is a FAQ on the difference between RSA and DSA. (The entire RSA Laboratories' FAQ is very good reading if you would like a more in depth understanding of cryptography.)

There is one last point to make on the subject of digital certificates and CAs. A digital certificate contains an expiration date, and the certificate cannot be deleted from the CA before that date. What if a private key becomes compromised before that date? You'll obviously want to generate a new certificate containing the new public key. However, you can't delete the old certificate until it expires. To ensure that certificate won't inadvertently be used to authenticate a recipient, you can place it in the CRL or Certificate Revocation List. Whenever a certificate is requested, the CRL is read to ensure that the certificate is still valid.

Authenticating the recipient is one half of the authenticity component. The other half involves generating and exchanging the information that will be used to create the session key which in turn will be used to encrypt and decrypt the data. This again requires an asymmetric algorithm, but this time it is usually the Diffie Hellman, or DH, algorithm.

It is important to realize that Diffie Hellman doesn't make the actual session key itself, but the keying information used to generate that key. This involves a fair bit of fancy math which isn't for the faint of heart. The best explanation I've come across, in understandable language with diagrams, is Diffie-Hellman Key Exchange - A Non-Mathematician's Explanation by Keith Palmgren.

It is important that the keying information is kept as secure as possible, so the larger the bit size, the better. The possible Diffie Hellman bit sizes have been divided into groups. The following chart summarizes the possible Diffie Hellman Groups:

Group NameBit Size
1768
21024
51536

When configuring a cryptosytem, you should use the largest Diffie Hellman Group size that it supports.

The other term you'll see associated with the keying information is PFS, or Perfect Forward Secrecy, which Diffie Hellman supports. PFS ensures that the new keying information is not mathematically related to the old keying information. This means that if someone sniffs an old session key, they won't be able to use that key to guess the new session key. PFS is always a good thing and you should use it if the cryptosytem supports it.

Putting It All Together

Let's do a quick recap and summarize how a cryptosytem protects the data transmitted onto a network.

  1. First, the recipient's public key is used to verify that you are sending the data to the correct recipient. That public key was created by the RSA algorithm and is typically stored in a digital certificate that resides on a CA.
  2. Once the recipient is verified, the DH algorithm is used to create the information that will be used to create the session key.
  3. Once the keying information is available, a key that is unique to that session is created. This key is used by both the sender and the receiver to encrypt and decrypt the data they send to each other. It is important that this key changes often.
  4. Before the data is encrypted, a cryptographic checksum is calculated. Once the data is decrypted, the cryptographic checksum is recalculated to ensure that the recipient has received the original message.

In next week's article, you'll have the opportunity to see many of these cryptographic terms in action as we'll be configuring a cryptosytem that comes built-in to your FreeBSD system: ssh.

Dru Lavigne is a network and systems administrator, IT instructor, author and international speaker. She has over a decade of experience administering and teaching Netware, Microsoft, Cisco, Checkpoint, SCO, Solaris, Linux, and BSD systems. A prolific author, she pens the popular FreeBSD Basics column for O'Reilly and is author of BSD Hacks and The Best of FreeBSD Basics.


Read more FreeBSD Basics columns.

Return to the BSD DevCenter.


Copyright © 2009 O'Reilly Media, Inc.