gnupg/doc/gph/c2.sgml

<chapter id="concepts" xreflabel="2">
<docinfo>
<date>
$Id$
</date>
</docinfo>
<title>
Concepts
</title>

<para>
&Gnupg; makes uses of several cryptographic concepts including 
<firstterm>symmetric ciphers</firstterm>, 
<firstterm>public-key ciphers</firstterm>, and 
<firstterm>one-way hashing</firstterm>.
You can make basic use &gnupg; without fully understanding these concepts,
but in order to use it wisely some understanding of them is necessary.
</para>

<para>
This chapter introduces the basic cryptographic concepts used in GnuPG.
Other books cover these topics in much more detail.
A good book with which to pursue further study is 
<ulink url="http://www.counterpane.com/schneier.html">Bruce 
Schneier</ulink>'s
<ulink url="http://www.counterpane.com/applied.html">"Applied 
Cryptography"</ulink>.
</para>

<sect1>
<title>
Symmetric ciphers
</title>

<para>
A symmetric cipher is a cipher that uses the same key for both encryption
and decryption.
Two parties communicating using a symmetric cipher must agree on the
key beforehand.
Once they agree, the sender encrypts a message using the key, sends it
to the receiver, and the receiver decrypts the message using the key.
As an example, the German Enigma is a symmetric cipher, and daily keys
were distributed as code books.
Each day, a sending or receiving radio operator would consult his copy
of the code book to find the day's key.
Radio traffic for that day was then encrypted and decrypted using the
day's key.
Modern examples of symmetric ciphers include 3DES, Blowfish, and IDEA.
</para>

<para>
A good cipher puts all the security in the key and none in the algorithm.
In other words, it should be no help to an attacker if he knows which
cipher is being used.
Only if he obtains the key would knowledge of the algorithm be needed.
The ciphers used in &gnupg; have this property.
</para>

<para>
Since all the security is in the key, then it is important that it be
very difficult to guess the key.
In other words, the set of possible keys, &ie;, the <emphasis>key
space</emphasis>, needs
to be large.
While at Los Alamos, Richard Feynman was famous for his ability to
crack safes.
To encourage the mystique he even carried around a set of tools
including an old stethoscope.
In reality, he used a variety of tricks to reduce the number of
combinations he had to try to a small number and then simply guessed
until he found the right combination.
In other words, he reduced the size of the key space.
</para>

<para>
Britain used machines to guess keys during World War 2.
The German Enigma had a very large key space, but the British built
speciailzed computing engines, the Bombes, to mechanically try 
keys until the day's key was found.
This meant that sometimes they found the day's key within hours of
the new key's use, but it also meant that on some days they never
did find the right key.
The Bombes were not general-purpose computers but were precursors
to modern-day computers.
</para>

<para>
Today, computers can guess keys very quickly, and this is why key
size is important in modern cryptosystems.
The cipher DES uses a 56-bit key, which means that there are
<!-- inlineequation -->
2<superscript>56</superscript> possible keys.
<!-- inlineequation -->
2<superscript>56</superscript> is 72,057,594,037,927,936 keys.
This is a lot of keys, but a general-purpose computer can check the
entire key space in a matter of days.
A specialized computer can check it in hours.
On the other hand, more recently designed ciphers such as 3DES, 
Blowfish, and IDEA
<!-- inlineequation -->
all use 128-bit keys, which means there are 2<superscript>128</superscript> 
possible keys.
This is many, many more keys, and even if all the computers on the
planet cooperated, it could still take more time than the age of
the universe to find the key.
</para>
</sect1>

<sect1>
<title>
Public-key ciphers
</title>

<para>
The primary problem with symmetric ciphers is not their security but
with key exchange.
Once the sender and receiver have exchanged keys, that key can be
used to securely communicate, but what secure communication channel
was used to communicate the key itself?
In particular, it would probably be much easier for an attacker to work
to intercept the key than it is to try all the keys in the key space.
Another problem is the number of keys needed.
<!-- inlineequation -->
If there are <emphasis>n</emphasis> people who need to communicate, then 
<!-- inlineequation -->
<emphasis>n(n-1)/2</emphasis> keys
are needed for each pair of people to communicate privately.
This may be ok for a small number of people but quickly becomes unwieldly
for large groups of people.
</para>

<para>
Public-key ciphers were invented to avoid the key-exchange problem
entirely.
A public-key cipher uses a pair of keys for sending messages.
The two keys belong to the person receiving the message.
One key is a <emphasis>public key</emphasis> and may be given to anybody.
The other key is a <emphasis>private key</emphasis> and is kept 
secret by the owner.
A sender encrypts a message using the public key and once encrypted,
only the private key may be used to decrypt it.
</para>

<para>
This protocol solves the key-exchange problem inherent with symmetric
ciphers.
There is no need for the sender and receiver to agree
upon a key.
All that is required is that some time before secret communication the
sender gets a copy of the receiver's public key.
Furthermore, the one public key can be used by anybody wishing to
communicate with the receiver.
<!-- inlineequation -->
So only <emphasis>n</emphasis> keypairs are needed for <emphasis>n</emphasis> 
people to communicate secretly
with one another,
</para>

<para>
Public-key ciphers are based on one-way trapdoor functions.
A one-way function is a function that is easy to compute,
but the inverse is hard to compute.
For example, it is easy to multiply two prime numbers together to get
a composite, but it is difficult to factor a composite into its prime
components.a
A one-way trapdoor function is similar, but it has a trapdoor.
That is, if some piece of information is known, it becomes easy
to compute the inverse.
For example, if you have a number made of two prime factors, then knowing
one of the factors makes it easy to compute the second.
Given a public-key cipher based on prime factorization, the public
key contains a composite number made from two large prime factors, and
the encryption algorithm uses that composite to encrypt the
message.
The algorithm to decrypt the message requires knowing the prime factors,
so decryption is easy if you have the private key containing one of the
factors but extremely difficult if you do not have it.
</para>

<para>
As with good symmetric ciphers, with a good public-key cipher all of the
security rests with the key.
Therefore, key size is a measure of the system's security, but
one cannot compare the size of a symmetric cipher key and a public-key
cipher key as a measure of their relative security.
In a brute-force attack on a symmetric cipher with a key size of 80 bits,
<!-- inlineequation -->
the attacker must enumerate up to 2<superscript>81</superscript>-1 keys to 
find the right key.
In a brute-force attack on a public-key cipher with a key size of 512 bits,
the attacker must factor a composite number encoded in 512 bits (up to
155 decimal digits).
The workload for the attacker is fundamentally different depending on
the cipher he is attacking.
While 128 bits is sufficient for symmetric ciphers, given today's factoring
technology public keys with 1024 bits are recommended for most purposes.
</para>
</sect1>

<sect1>
<title>
Hybrid ciphers
</title>

<para>
Public-key ciphers are no panacea.
Many symmetric ciphers are stronger from a security standpoint,
and public-key encryption and decryption are more expensive than the
corresponding operations in symmetric systems.
Public-key ciphers are nevertheless an effective tool for distributing
symmetric cipher keys, and that is how they are used in hybrid cipher
systems.
</para>

<para>
A hybrid cipher uses both a symmetric cipher and a public-key cipher.
It works by using a public-key cipher to share a key for the symmetric
cipher.
The actual message being sent is then encrypted using the key and sent
to the recipient.
Since symmetric key sharing is secure, the symmetric key used is different
for each message sent.
Hence it is sometimes called a session key.
</para>

<para>
Both PGP and &gnupg; use hybrid ciphers.
The session key, encrypted using the public-key cipher, and the message
being sent, encrypted with the symmetric cipher, are automatically
combined in one package.
The recipient uses his private-key to decrypt the session key and the
session key is then used to decrypt the message.
</para>

<para>
A hybrid cipher is no stronger than the public-key cipher or symmetric
cipher it uses, whichever is weaker.
In PGP and &gnupg;, the public-key cipher is probably the weaker of
the pair.
Fortunately, however, if an attacker could decrypt a session key it
would only be useful for reading the one message encrypted with that
session key.
The attacker would have to start over and decrypt another session
key in order to read any other message.
</para>
</sect1>

<sect1>
<title>
Digital signatures
</title>

<para>
A hash function is a many-to-one function that maps its input to a
value in a finite set.
Typically this set is a range of natural numbers.
<!-- inlineequation -->
A simple ehash function is <emphasis>f</emphasis>(<emphasis>x</emphasis>) = 0 
for all integers <emphasis>x</emphasis>.
A more interesting hash function is 
<emphasis>f</emphasis>(<emphasis>x</emphasis>) = <emphasis>x</emphasis> 
<emphasis>mod</emphasis> 37, which
maps <emphasis>x</emphasis> to the remainder of dividing <emphasis>x</emphasis> by 37.
</para>

<para>
A document's digital signature is the result of applying a hash
function to the document.
To be useful, however, the hash function needs to satisfy two
important properties.
First, it should be hard to find two documents that hash to the
same value.
Second, given a hash value it should be hard to recover the document
that produced that value.
</para>

<para>
Some public-key ciphers<footnote><para>
The cipher must have the property that the actual public key or private
key could be used by the encryption algorithm as the public key.
RSA is an example of such an algorithm while ElGamal is not an example.
</para>
</footnote> could be used to sign documents.
The signer encrypts the document with his <emphasis>private</emphasis> key.
Anybody wishing to check the signature and see the document simply
uses the signer's public key to decrypt the document.
This algorithm does satisfy the two properties needed from a good hash
function, but in practice, this algorithm is too slow to be useful.
</para>

<para>
An alternative is to use hash functions designed to satisfy these
two important properties.
SHA and MD5 are examples of such algorithms.
Using such an algorithm, a document is signed by hashing it, and
the hash value is the signature.
Another person can check the signature by also hashing their copy of the
document and comparing the hash value they get with the hash value of
the original document.
If they match, it is almost certain that the documents are identical.
</para>

<para>
Of course, the problem now is using a hash function for digital
signatures without permitting an attacker to interfere with signature
checking.
If the document and signature are sent unencrypted, an attacker could
modify the document and generate a corresponding signature without the
recipient's knowledge.
If only the document is encrypted, an attacker could tamper with the
signature and cause a signature check to fail.
A third option is to use a hybrid public-key encryption to encrypt both
the signature and document.
The signer uses his private key, and anybody can use his public key
to check the signature and document.
This sounds good but is actually nonsense.
If this algorithm truly secured the document it would also
secure it from tampering and there would be no need for the signature.
The more serious problem, however, is that this does not protect either
the signature or document from tampering.
With this algorithm, only the session key for the symmetric cipher
is encrypted using the signer's private key.
Anybody can use the public key to recover the session key.
Therefore, it is straightforward for an attacker to recover the session
key and use it to encrypt substitute documents and signatures to send
to others in the sender's name.
</para>

<para>
An algorithm that does work is to use a public key algorithm to
encrypt only the signature.
In particular, the hash value is encrypted using the signer's private
key, and anbody can check the signature using the public key.
The signed document can be sent using any other encryption algorithm
including none if it is a public document.
If the document is modified the signature check will fail, but this
is precisely what the signature check is supposed to catch.
The Digital Signature Standard (DSA) is a public key signature 
algorithm that works as just described.
DSA is the primary signing algorithm used in &Gnupg;.
</para>

</sect1>
</chapter>
See ChangeLog: Mon Sep 6 19:59:08 CEST 1999 Werner Koch 1999-09-06 20:10:27 +02:00			`<chapter id="concepts" xreflabel="2">`
			`<docinfo>`
			`<date>`
			$Id$
			`</date>`
			`</docinfo>`
			`<title>`
			`Concepts`
			`</title>`

			`<para>`
			`&Gnupg; makes uses of several cryptographic concepts including`
			`<firstterm>symmetric ciphers</firstterm>,`
			`<firstterm>public-key ciphers</firstterm>, and`
			`<firstterm>one-way hashing</firstterm>.`
			`You can make basic use &gnupg; without fully understanding these concepts,`
			`but in order to use it wisely some understanding of them is necessary.`
			`</para>`

			`<para>`
			`This chapter introduces the basic cryptographic concepts used in GnuPG.`
			`Other books cover these topics in much more detail.`
			`A good book with which to pursue further study is`
			`<ulink url="http://www.counterpane.com/schneier.html">Bruce`
			`Schneier</ulink>'s`
			`<ulink url="http://www.counterpane.com/applied.html">"Applied`
			`Cryptography"</ulink>.`
			`</para>`

			`<sect1>`
			`<title>`
			`Symmetric ciphers`
			`</title>`

			`<para>`
			`A symmetric cipher is a cipher that uses the same key for both encryption`
			`and decryption.`
			`Two parties communicating using a symmetric cipher must agree on the`
			`key beforehand.`
			`Once they agree, the sender encrypts a message using the key, sends it`
			`to the receiver, and the receiver decrypts the message using the key.`
			`As an example, the German Enigma is a symmetric cipher, and daily keys`
			`were distributed as code books.`
			`Each day, a sending or receiving radio operator would consult his copy`
			`of the code book to find the day's key.`
			`Radio traffic for that day was then encrypted and decrypted using the`
			`day's key.`
			`Modern examples of symmetric ciphers include 3DES, Blowfish, and IDEA.`
			`</para>`

			`<para>`
			`A good cipher puts all the security in the key and none in the algorithm.`
			`In other words, it should be no help to an attacker if he knows which`
			`cipher is being used.`
			`Only if he obtains the key would knowledge of the algorithm be needed.`
			`The ciphers used in &gnupg; have this property.`
			`</para>`

			`<para>`
			`Since all the security is in the key, then it is important that it be`
			`very difficult to guess the key.`
			`In other words, the set of possible keys, &ie;, the <emphasis>key`
			`space</emphasis>, needs`
			`to be large.`
			`While at Los Alamos, Richard Feynman was famous for his ability to`
			`crack safes.`
			`To encourage the mystique he even carried around a set of tools`
			`including an old stethoscope.`
			`In reality, he used a variety of tricks to reduce the number of`
			`combinations he had to try to a small number and then simply guessed`
			`until he found the right combination.`
			`In other words, he reduced the size of the key space.`
			`</para>`

			`<para>`
			`Britain used machines to guess keys during World War 2.`
			`The German Enigma had a very large key space, but the British built`
			`speciailzed computing engines, the Bombes, to mechanically try`
			`keys until the day's key was found.`
			`This meant that sometimes they found the day's key within hours of`
			`the new key's use, but it also meant that on some days they never`
			`did find the right key.`
			`The Bombes were not general-purpose computers but were precursors`
			`to modern-day computers.`
			`</para>`

			`<para>`
			`Today, computers can guess keys very quickly, and this is why key`
			`size is important in modern cryptosystems.`
			`The cipher DES uses a 56-bit key, which means that there are`
			`<!-- inlineequation -->`
			`2<superscript>56</superscript> possible keys.`
			`<!-- inlineequation -->`
			`2<superscript>56</superscript> is 72,057,594,037,927,936 keys.`
			`This is a lot of keys, but a general-purpose computer can check the`
			`entire key space in a matter of days.`
			`A specialized computer can check it in hours.`
			`On the other hand, more recently designed ciphers such as 3DES,`
			`Blowfish, and IDEA`
			`<!-- inlineequation -->`
			`all use 128-bit keys, which means there are 2<superscript>128</superscript>`
			`possible keys.`
			`This is many, many more keys, and even if all the computers on the`
			`planet cooperated, it could still take more time than the age of`
			`the universe to find the key.`
			`</para>`
			`</sect1>`

			`<sect1>`
			`<title>`
			`Public-key ciphers`
			`</title>`

			`<para>`
			`The primary problem with symmetric ciphers is not their security but`
			`with key exchange.`
			`Once the sender and receiver have exchanged keys, that key can be`
			`used to securely communicate, but what secure communication channel`
			`was used to communicate the key itself?`
			`In particular, it would probably be much easier for an attacker to work`
			`to intercept the key than it is to try all the keys in the key space.`
			`Another problem is the number of keys needed.`
			`<!-- inlineequation -->`
			`If there are <emphasis>n</emphasis> people who need to communicate, then`
			`<!-- inlineequation -->`
			`<emphasis>n(n-1)/2</emphasis> keys`
			`are needed for each pair of people to communicate privately.`
			`This may be ok for a small number of people but quickly becomes unwieldly`
			`for large groups of people.`
			`</para>`

			`<para>`
			`Public-key ciphers were invented to avoid the key-exchange problem`
			`entirely.`
			`A public-key cipher uses a pair of keys for sending messages.`
			`The two keys belong to the person receiving the message.`
			`One key is a <emphasis>public key</emphasis> and may be given to anybody.`
			`The other key is a <emphasis>private key</emphasis> and is kept`
			`secret by the owner.`
			`A sender encrypts a message using the public key and once encrypted,`
			`only the private key may be used to decrypt it.`
			`</para>`

			`<para>`
			`This protocol solves the key-exchange problem inherent with symmetric`
			`ciphers.`
			`There is no need for the sender and receiver to agree`
			`upon a key.`
			`All that is required is that some time before secret communication the`
			`sender gets a copy of the receiver's public key.`
			`Furthermore, the one public key can be used by anybody wishing to`
			`communicate with the receiver.`
			`<!-- inlineequation -->`
			`So only <emphasis>n</emphasis> keypairs are needed for <emphasis>n</emphasis>`
			`people to communicate secretly`
			`with one another,`
			`</para>`

			`<para>`
			`Public-key ciphers are based on one-way trapdoor functions.`
			`A one-way function is a function that is easy to compute,`
			`but the inverse is hard to compute.`
			`For example, it is easy to multiply two prime numbers together to get`
			`a composite, but it is difficult to factor a composite into its prime`
			`components.a`
			`A one-way trapdoor function is similar, but it has a trapdoor.`
			`That is, if some piece of information is known, it becomes easy`
			`to compute the inverse.`
			`For example, if you have a number made of two prime factors, then knowing`
			`one of the factors makes it easy to compute the second.`
			`Given a public-key cipher based on prime factorization, the public`
			`key contains a composite number made from two large prime factors, and`
			`the encryption algorithm uses that composite to encrypt the`
			`message.`
			`The algorithm to decrypt the message requires knowing the prime factors,`
			`so decryption is easy if you have the private key containing one of the`
			`factors but extremely difficult if you do not have it.`
			`</para>`

			`<para>`
			`As with good symmetric ciphers, with a good public-key cipher all of the`
			`security rests with the key.`
			`Therefore, key size is a measure of the system's security, but`
			`one cannot compare the size of a symmetric cipher key and a public-key`
			`cipher key as a measure of their relative security.`
			`In a brute-force attack on a symmetric cipher with a key size of 80 bits,`
			`<!-- inlineequation -->`
			`the attacker must enumerate up to 2<superscript>81</superscript>-1 keys to`
			`find the right key.`
			`In a brute-force attack on a public-key cipher with a key size of 512 bits,`
			`the attacker must factor a composite number encoded in 512 bits (up to`
			`155 decimal digits).`
			`The workload for the attacker is fundamentally different depending on`
			`the cipher he is attacking.`
			`While 128 bits is sufficient for symmetric ciphers, given today's factoring`
			`technology public keys with 1024 bits are recommended for most purposes.`
			`</para>`
			`</sect1>`

			`<sect1>`
			`<title>`
			`Hybrid ciphers`
			`</title>`

			`<para>`
			`Public-key ciphers are no panacea.`
			`Many symmetric ciphers are stronger from a security standpoint,`
			`and public-key encryption and decryption are more expensive than the`
			`corresponding operations in symmetric systems.`
			`Public-key ciphers are nevertheless an effective tool for distributing`
			`symmetric cipher keys, and that is how they are used in hybrid cipher`
			`systems.`
			`</para>`

			`<para>`
			`A hybrid cipher uses both a symmetric cipher and a public-key cipher.`
			`It works by using a public-key cipher to share a key for the symmetric`
			`cipher.`
			`The actual message being sent is then encrypted using the key and sent`
			`to the recipient.`
			`Since symmetric key sharing is secure, the symmetric key used is different`
			`for each message sent.`
			`Hence it is sometimes called a session key.`
			`</para>`

			`<para>`
			`Both PGP and &gnupg; use hybrid ciphers.`
			`The session key, encrypted using the public-key cipher, and the message`
			`being sent, encrypted with the symmetric cipher, are automatically`
			`combined in one package.`
			`The recipient uses his private-key to decrypt the session key and the`
			`session key is then used to decrypt the message.`
			`</para>`

			`<para>`
			`A hybrid cipher is no stronger than the public-key cipher or symmetric`
			`cipher it uses, whichever is weaker.`
			`In PGP and &gnupg;, the public-key cipher is probably the weaker of`
			`the pair.`
			`Fortunately, however, if an attacker could decrypt a session key it`
			`would only be useful for reading the one message encrypted with that`
			`session key.`
			`The attacker would have to start over and decrypt another session`
			`key in order to read any other message.`
			`</para>`
			`</sect1>`

			`<sect1>`
			`<title>`
			`Digital signatures`
			`</title>`

			`<para>`
			`A hash function is a many-to-one function that maps its input to a`
			`value in a finite set.`
			`Typically this set is a range of natural numbers.`
			`<!-- inlineequation -->`
			`A simple ehash function is <emphasis>f</emphasis>(<emphasis>x</emphasis>) = 0`
			`for all integers <emphasis>x</emphasis>.`
			`A more interesting hash function is`
			`<emphasis>f</emphasis>(<emphasis>x</emphasis>) = <emphasis>x</emphasis>`
			`<emphasis>mod</emphasis> 37, which`
			`maps <emphasis>x</emphasis> to the remainder of dividing <emphasis>x</emphasis> by 37.`
			`</para>`

			`<para>`
			`A document's digital signature is the result of applying a hash`
			`function to the document.`
			`To be useful, however, the hash function needs to satisfy two`
			`important properties.`
			`First, it should be hard to find two documents that hash to the`
			`same value.`
			`Second, given a hash value it should be hard to recover the document`
			`that produced that value.`
			`</para>`

			`<para>`
			`Some public-key ciphers<footnote><para>`
			`The cipher must have the property that the actual public key or private`
			`key could be used by the encryption algorithm as the public key.`
			`RSA is an example of such an algorithm while ElGamal is not an example.`
			`</para>`
			`</footnote> could be used to sign documents.`
			`The signer encrypts the document with his <emphasis>private</emphasis> key.`
			`Anybody wishing to check the signature and see the document simply`
			`uses the signer's public key to decrypt the document.`
			`This algorithm does satisfy the two properties needed from a good hash`
			`function, but in practice, this algorithm is too slow to be useful.`
			`</para>`

			`<para>`
			`An alternative is to use hash functions designed to satisfy these`
			`two important properties.`
			`SHA and MD5 are examples of such algorithms.`
			`Using such an algorithm, a document is signed by hashing it, and`
			`the hash value is the signature.`
			`Another person can check the signature by also hashing their copy of the`
			`document and comparing the hash value they get with the hash value of`
			`the original document.`
			`If they match, it is almost certain that the documents are identical.`
			`</para>`

			`<para>`
			`Of course, the problem now is using a hash function for digital`
			`signatures without permitting an attacker to interfere with signature`
			`checking.`
			`If the document and signature are sent unencrypted, an attacker could`
			`modify the document and generate a corresponding signature without the`
			`recipient's knowledge.`
			`If only the document is encrypted, an attacker could tamper with the`
			`signature and cause a signature check to fail.`
			`A third option is to use a hybrid public-key encryption to encrypt both`
			`the signature and document.`
			`The signer uses his private key, and anybody can use his public key`
			`to check the signature and document.`
			`This sounds good but is actually nonsense.`
			`If this algorithm truly secured the document it would also`
			`secure it from tampering and there would be no need for the signature.`
			`The more serious problem, however, is that this does not protect either`
			`the signature or document from tampering.`
			`With this algorithm, only the session key for the symmetric cipher`
			`is encrypted using the signer's private key.`
			`Anybody can use the public key to recover the session key.`
			`Therefore, it is straightforward for an attacker to recover the session`
			`key and use it to encrypt substitute documents and signatures to send`
			`to others in the sender's name.`
			`</para>`

			`<para>`
			`An algorithm that does work is to use a public key algorithm to`
			`encrypt only the signature.`
			`In particular, the hash value is encrypted using the signer's private`
			`key, and anbody can check the signature using the public key.`
			`The signed document can be sent using any other encryption algorithm`
			`including none if it is a public document.`
			`If the document is modified the signature check will fail, but this`
			`is precisely what the signature check is supposed to catch.`
			`The Digital Signature Standard (DSA) is a public key signature`
			`algorithm that works as just described.`
			`DSA is the primary signing algorithm used in &Gnupg;.`
			`</para>`

			`</sect1>`
			`</chapter>`