Cryptography - Elia Torre

**Sketching a Blockchain** ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image23.png]]] **Cryptography** Can you read this document at all? ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image24.png]] Well, it's... ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image25.png]] **Cryptography** or **Cryptology** come from Ancient Greek: kryptós (hidden, secret) and gaphein (to write) or logia (study). The idea is to protect data from being accessed by unauthorized people. It provides a mechanism for securely encoding the set of rules in the system. Cryptography is a deep academic research field utilizing many advanced mathematical techniques. Computer science also has its own branch focusing on solving cryptography problems. **Symmetric vs Asymmetric Cryptography** **Definition**: a key is random string consisting of hundreds or thousands of ones and zeroes (i.e. binary digits). The key is used by a cryptographic algorithm to transform plain text into cipher text or vice versa. **Symmetric vs Asymmetric keys**: - Symmetric algorithms use a single key for both encryption and decryption. - Asymmetric algorithms use two different keys: a private and a public one, hence the name. **The example** Let's assume that two friends want to communicate securely over the Internet. How do you get shared keys? The framework: Diffie-Hellman key exchange protocol. Aragorn, Legolas, Gimli and other friends are facing the Uruks at the Helm's Deep. They are in desperate need of assistance in order to defeat Sauron's army. They decide to use cryptography to send a message to Gandalf the White who can arrive with the Rohirrim. Aragorn and Gandalf came up with this protocol: - Share an arbitrary number R (public key) (don't care if intercepted) - Each think of their own secret keys without sharing them: - Aragorn picks "a" - Gandalf picks "g". |![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image26.png]] | ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image27.png]] | |---|---| We have the same secret key S, so problem solved! R and H\* have been exchanged! They can be intercepted! And if so, it would be trivial to know what a and g are. **The Log and Discrete Log problems** ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image28.png]] With Discrete Log Problem, guessing is the way-to-go. Basis for Diffie-Hellman schemes. **Exponentiation is easy, but it's very hard to know secret keys**. **Moving to the discrete** **Modulus operation doesn't screw up exponentiation** ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image29.png]] **So, what's the problem here?** The problem is that with increasing computer power and new algorithms it's getting easier to solve the problem! We have a conflict: keys have to be larger, but keys have to be smaller at the same time. **Elliptic Curves for the good!** ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image30.png]] Elliptic curves pop up when solving elliptic functions over a given space. An elliptic curve is a curve that's also naturally a group. The group law is constructed geometrically. Elliptic curves have (almost) nothing to do with ellipses, so put ellipses and conic sections out of your thoughts. Elliptic curves appear in many diverse areas of mathematics, ranging from: - Number theory - Complex analysis - Cryptography - Mathematical Physics ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image31.png]] **Property 1**: Elliptic curves are symmetric over the X-axis (aka, known as horizontal symmetry). **Property 2**: Given two points A and B, if we draw a straight line between A and B, that line will intercept the elliptic curve in at most one more point. |![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image33.png]] |![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image43.png]] | ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image42.png]] | |---|---|---| |![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image41.png]] | ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image40.png]] | ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image39.png]] | |![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image38.png]] | ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image37.png]] | ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image36.png]] | |![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image35.png]] | ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image34.png]] | ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image32.png]] | **Some Properties of Addition** **Theorem**, the addition law on E has the following properties: ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image44.png]] **Elliptic Curve Discrete Log Problem** Let's represent this operation in: ***nA = E***. Where *nA* does not represent standard multiplication. It's more of a ... "dot" operation which eventually yields to the generation of other points. - A dot A dot A ... Turns out, it's super hard to find n event if we had the starting point A and the arriving point E. In other words, it's hard to find how many times the "dot" operation has been done. **Order Independence is preserved** ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image45.png]] **Elliptic Curve Diffie-Hellman (ECDH)** As we did with mod(p), with EC we have to limit the set of possible values. ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image46.png]] **Elliptic Curves solve the issue** - Math involved in ECs is not vulnerable to the same algorithms - It's as easy to break a 228 bit EC as a 2380 bit DH - It seems we solved our problem: resilient to attacks and small in size. ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image47.png]] **Elliptic Curve Cryptography** One method for implementing public key cryptography is the Elliptic Curve Cryptography (ECC). | ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image49.png]] | ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image48.png]] | |---|---| **Downsides of Elliptic Curves** - Difficult mathematical representation - Many are patented - Fail without sufficient randomness Bad RNGs lead to successful attacks. - Lack of theoretical foundation. Are they really secure? - Built-in trap doors (i.e., backdoors)? **Hash Functions** The metaphor: What is a fingerprint? - Impressions of the friction ridges of all or any part of the fingers of the human hand. - Highly considered since they identify human beings uniquely. - Used to investigate crimes, identify offenders, etc. We want to be able to identify the data by using its digital fingerprints. One way to achieve data identification is to use **hash functions**. Has functions are small computer programs which transform any kind of data into a string of fixed length, regardless the size of input data. Remember: the number of possible inputs is larger than the number of possible outputs. **A hash function is a many-to-one function**. We focus on special cases called **cryptographic hash functions**. ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image50.png]] **Cryptographic Hash Functions** A cryptographic hash function is defined as a hash function which has the following 3 main properties: - **Preimage Resistance**: let y be the output of H for some unknown input. An input x which satisfies H(x) = y is called a preimage of y under H. By the many-to-one nature of H, preimages are not necessarily unique. The function H is said to be preimage resistant if it is computationally infeasible to calculate any preimage x of a given y. - **Second Presage Resistance**: given an input string x, it is computationally infeasible to find a different input string x' such that H(x) = H(x'). - **Collision Resistance**: it is computationally infeasible to find a pair of distinct input strings x, x' such that H(x) = H(x'). Let's add two more properties: - **Deterministic**: a hash function yields identical hash values for identical input data. - **Pseudorandom**: the hash value returned by the hash function changes (pseudo) randomly when the input data are changed. It should not be possible to predict the hash value based on the input data. What are the requirements to be a "good" hash function? How large does n (output length in bits) need to be? ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image51.png]] Having a large output length is not sufficient for a hash function to be considered cryptographic. The relation between the input bit string x and the output bit string y should be complicated to prevent easy recovery of x from y. When can an input-output relationship be considered complicated? Really hard to answer! An example could be SHA-256. **Secure Hash Algorithm** The SHA-256 hash function was announced in 2001 by the National Institute of Standards and Technology (NIST). The number 256, we already know, indicates the output length in bits. The SHA-256 function specification restricts the input to be at most 2^64^ -- 1 bits long. Up to now, there is no formal proof that SHA-256 is in fact a cryptographic hash function. But it has no known weakness which make finding preimages, second presages and collisions computationally feasible. In other words, it would take an infinite amount of computational time to break it. **One hash value to rule them all...** What if we have to provide one unique hash value for several independent data? Simply, you can't use a hash function directly. Blockchain data structure has to deal with multiple transactions at once and requires one single hash value. ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image52.png]] So, how do we solve this issue? **Hashing Patterns** The idea is to build structures made by hash functions. There exist different patterns: - Independent Hashing - Repeated Hashing - Combined Hashing - Sequential Hashing - Hierarchical Hashing **Independent and Repeated Hashing** |![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image53.png]] | ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image54.png]] | |---|---| - **Independent Hashing**, just apply the hash function to each and every independent piece of data. - **Repeated Hashing** - Just apply the hash function repeatedly to the output of the step before. - You can indeed hashing an hash value. **Combined Hashing** ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image55.png]] The goal is to compute a single hash value for more than one piece of data in one single step. The idea is to combine all the independent data into one and then apply the hash function. Combining data is computationally intensive: it costs time, memory allocation, storage. Be sure to apply this method when individual chunks of data are small. Also, we have no hash value associated with each chunk. **Sequential Hashing** ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image56.png]] With sequential hashing we incrementally update of a hash value as new data arrive. It is a mixture of the combined and repeated hashing methods in this order. We first combined the chunks and compute a single hash value (combined hashing), then we compute another hash value on the previous one (repeated hashing). This comes in handy if we want to have a single hash value over time with a well tracked updating process. **Hierarchical Hashing** ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image57.png]] Here we just want to have a single hash value which represents a given structure. It combines the independent hashing and the combined hashing. It's more efficient then combined hashing since here the chunks are simple hash values. **Digital Signatures** A digital signature is supposed to be the digital analog to a handwritten signature on paper. We desire **two properties** from digital signatures that correspond well to the handwritten signature technology: - Only you can make your signature, but anyone who sees it can verify that it's valid. - We want the signature to be tied to a particular document so that the signature cannot be used to indicate your agreement orendorsement of a different document. How can we build this in a digital form using cryptography? A digital signature scheme consists of 3 algorithms: ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image58.png]] Two properties: - Valid signature must verify that: verify(pk, message, sign(sk, message)) == true. - It is straightforward that valid signatures must verify. - If someone signs a message with the secret key, and someone else later tries to validate that signature over that same message using the public key, the signature must validate correctly. - This is a basic requirement for signatures to be useful at all. - Signatures are **existentially unforgeable**. - It states that it's computationally infeasible to forge signatures. - That is, an adversary who knows your public key and gets to see your signatures on some other messages can't forge your signature on some message for which he has not seen your signature. - This unforgeability property is generally formalized in terms of a game that we play with an adversary. **Digital Signatures: the unforgeability game** In the unforgeability game, there is an adversary who claims that he can forge signatures and a challenger that will test this claim. The first thing we do is use *generateKeys()* to generate a secret signing key and a corresponding public verification key. We give the secret key to the challenger so he can make signatures. We give the public key both to the challenger and to the adversary. So the adversary only knows information that's public, and his mission is to try to forge a message. Intuitively, the setup of this game matches real world conditions. A real-life attacker would likely be able to see valid signatures from their would-be victim on a number of different documents. And maybe the attacker could even manipulate the victim into signing innocuous-looking documents if that's useful to the attacker. ![[Bocconi/Bocconi - Introduction to Blockchain/Images - Bocconi Introduction to Blockchain/image59.png]]