Hash functions



The meaning of the verb “to hash” – to chop or scramble something – provides a clue as to what hash functions do to data. That’s right, they “scramble” data and convert it into a numerical value. And no matter how long the input is, the output value is always of the same length. Hash functions are also referred to as hashing algorithms or message digest functions. They are used across many areas of computer science, for example:

  • To help secure communication between web servers and browsers, and generate session IDs for internet applications and data caching
  • To protect sensitive data such as passwords, web analytics, and payment details
  • To add digital signatures to emails
  • To locate identical or similar data sets via lookup functions
Definition

A hash function converts strings of different lengths into fixed-length strings known as hash values or digests. For example, you can use hashing to turn passwords into fixed-length strings drawn from a permitted set of characters. The output values cannot practically be inverted to recover the original input.


All of the different passwords are converted into fixed-length strings before being stored in the database. These output strings cannot be converted back to find out the actual passwords.


What are the properties of hash functions?

Hash functions are designed so that they have the following properties:

One-way

Once a hash value has been generated, it must be practically impossible (computationally infeasible) to convert it back into the original data. For instance, in the example above, there must be no feasible way of converting “$P$Hv8rpLanTSYSA/2bP1xN.S6Mdk32.Z3” back into “susi_562#alone”.

Collision-free

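For a hash function to be collision-free, it must be practically impossible to find two different strings that map to the same output hash. Because the number of possible inputs is unlimited while the output has a fixed length, collisions must exist in theory; a well-designed function makes them so hard to find that, in practice, every input string generates its own output string. This type of hash function is also referred to as a cryptographic hash function. In the example hash function above, there are no identical hash values, so there are no “collisions” between the output strings. Designers of hash functions rely on long output values, among other techniques, to make such collisions extremely unlikely.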
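
To make the idea of a collision concrete, here is a minimal Python sketch. It assumes a deliberately weakened hash: SHA-256 cut down to 4 hexadecimal characters (16 bits). The helper names `tiny_hash` and `find_collision` are made up for this illustration. With so few possible outputs, two different inputs with the same hash turn up quickly, which is why real hash functions use much longer outputs.

```python
import hashlib

def tiny_hash(text: str, hex_chars: int = 4) -> str:
    """Hypothetical helper: SHA-256 truncated to a few hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:hex_chars]

def find_collision(hex_chars: int = 4):
    """Try 'input-0', 'input-1', ... until two inputs share a truncated hash."""
    seen = {}  # truncated hash -> first input that produced it
    counter = 0
    while True:
        candidate = f"input-{counter}"
        digest = tiny_hash(candidate, hex_chars)
        if digest in seen:
            return seen[digest], candidate, digest
        seen[digest] = candidate
        counter += 1

first, second, shared = find_collision()
print(f"{first!r} and {second!r} both map to {shared!r}")
```

The search is guaranteed to stop, because only 65,536 different 4-character outputs exist. The full 64-character SHA-256 values of the two inputs still differ.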

Lightning-fast

If it takes too long for a hash function to compute hash values, the procedure is not much use. Hash functions must, therefore, be very fast. In databases, hash values are stored in so-called hash tables to ensure fast access.

What is a hash value?

A hash value is the output string generated by a hash function. No matter the input, all of the output strings generated by a particular hash function are of the same length. The length is defined by the hashing algorithm used. The output strings are created from a fixed set of permitted characters defined by the hash function.

Hash values generated using the SHA256 function are always of the same length, irrespective of the number and type of characters in the input string.
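
The fixed output length can be checked with a short Python sketch using the standard `hashlib` module. The sample inputs are arbitrary choices for this illustration.

```python
import hashlib

# Inputs of very different lengths (arbitrary examples)
inputs = ["a", "apple", "a much longer input string " * 100]

digests = []
for text in inputs:
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    digests.append(digest)
    print(f"input length {len(text):>5} -> digest length {len(digest)}")
```

Each digest is written as 64 hexadecimal characters, which corresponds to 256 bits.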

The hash value is the result calculated by the hash function and algorithm. Because hash values are practically unique, like human fingerprints, they are also referred to as “fingerprints”. If you take the lower-case letters “a” to “f” and the digits “0” to “9” and define a hash value length of 64 characters, there are 16^64, or about 1.1579209e+77, possible output values – that’s a number with 78 digits! This shows that even with relatively short strings, you can still generate acceptable fingerprints.
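
The number of possible output values can be verified with a few lines of Python. The variable names are chosen for this illustration only.

```python
alphabet_size = 16   # characters "0"-"9" and "a"-"f"
digest_length = 64   # characters in one hash value

possible_values = alphabet_size ** digest_length

print(possible_values == 2 ** 256)   # True: 64 hex characters are 256 bits
print(len(str(possible_values)))     # 78 digits
print(f"{possible_values:.7e}")      # 1.1579209e+77
```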

The “sha256” hashing algorithm is being used to hash the input value “apple”. The corresponding hash value or fingerprint is always “3a42c503953909637f78dd8c99b3b85ddde362415585afc11901bdefe8349102”.

Hash functions and websites

With SSL-encrypted data transmission, when the web server receives a request, it sends the server certificate to the user’s browser. A session ID is then generated using a hash function, and this is sent to the server where it is decrypted and verified. If the server approves the session ID, the encrypted HTTPS connection is established and data can be exchanged. All of the data packets exchanged are also encrypted, so it is almost impossible for hackers to gain access.

An extract from the certificate for German broadcasting corporation Deutsche Welle, showing the key the server uses to establish a communication session with the user’s browser.

Session IDs are generated using data relating to a site visit, such as the IP address and time stamp, and were traditionally communicated with the URL. One common use of session IDs is to give unique identifiers to people shopping on a website. Nowadays, session IDs are rarely passed as a URL parameter (for example, as something like www.domain.tld/index?sid=d4ccaf2627557c756a0762419a4b6695). Instead, they are stored in a cookie, which the browser sends in an HTTP header.

Hash values are also used to encrypt cached data to prevent unauthorized users from using the cache to access login and payment details or other information about a site.

Communication between an FTP server and a client using the SFTP protocol also works in a similar way.

Protection of sensitive data

Login details for online accounts are frequently the target of cyber-attacks. Hackers either want to disrupt operation of a website (for example, to reduce income generated by traffic-based ads) or access information about payment methods.

The WordPress Content Management System offers a range of security functions for authenticating registered site users. The keys shown above were generated using various hashing algorithms.

In the WordPress example above, you can see that passwords are always hashed before they are stored. Combined with the session IDs generated in the system, this ensures a high level of security. This is especially important for protection against “brute force” attacks. In this kind of attack, hackers hash one candidate password after another and compare the results until they find a match that gives them access. Using long passwords with high security standards makes these attacks less likely to succeed, because the amount of computing power required is so high. Remember: Never use simple passwords, and be sure to protect all of your login details and data against unauthorized access.

Digital signatures

Email traffic is sent via servers that are specially designed to transmit this type of message. Keys generated using hash functions are also used to add a digital signature to messages.

Adding a digital signature to an email is like signing a handwritten letter – you sign once, and your signature is unique.

The steps involved in sending an email with a digital signature are:

  • Alice (the sender) converts her message into a hash value and encrypts the hash value using her private key. This encrypted hash value is the digital signature.
  • Alice sends the email and the digital signature to the recipient, Bob.
  • Bob generates a hash value of the message using the same hash function. He also decrypts the digital signature using Alice’s public key and compares the two hashes.
  • If the two hash values match, Bob knows that Alice’s message has not been tampered with during transmission.
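
The steps above can be sketched in Python with a toy version of RSA. Everything here is an assumption made for illustration: the key is built from two small, publicly known primes, there is no padding, and the hash is reduced to fit the key size, so the code is not secure and only shows the hash-then-sign flow. Real applications should use an established cryptography library.

```python
import hashlib

# Toy key material (assumption for illustration; far too small and well known)
P = 2**61 - 1                      # a Mersenne prime
Q = 2**89 - 1                      # another Mersenne prime
N = P * Q                          # public modulus
E = 65537                          # Alice's public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # Alice's private exponent (Python 3.8+)

def message_hash(message: str) -> int:
    """Hash the message with SHA-256 and reduce it to a number below N."""
    digest = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % N

def sign(message: str) -> int:
    """Alice: encrypt the hash value with the private key."""
    return pow(message_hash(message), D, N)

def verify(message: str, signature: int) -> bool:
    """Bob: decrypt the signature with the public key and compare hashes."""
    return pow(signature, E, N) == message_hash(message)

message = "Meet me at noon. Alice"
signature = sign(message)
print(verify(message, signature))                   # True
print(verify("Meet me at 1pm. Alice", signature))   # False
```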

Please note that a digital signature proves the integrity of a message but does not actually encrypt it. If you’re sending confidential data, it’s therefore best to encrypt it as well as using a digital signature.

How can hash functions be used to perform lookups?

Searching through large quantities of data is a very resource-intensive process. Imagine you’ve got a table listing every inhabitant of a big city, with lots of different fields for each entry (first name, second name, address, etc.). Finding just one term would be very time-consuming and require a lot of computing power. To simplify the process, each entry in the table can be converted into a unique hash value. The search term is then converted to a hash value. This limits the number of letters, digits and symbols that have to be compared, which is much more efficient than searching every field that exists in the data table, for example, for all first names beginning with “Ann”.
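
A hash-based lookup of this kind can be sketched in Python as follows. The bucket count, the record fields and the function names are assumptions for this illustration. Each record is filed into a bucket chosen from the hash of its last name, so a search only has to look inside one bucket instead of scanning the whole table.

```python
import hashlib

NUM_BUCKETS = 1024  # assumed table size

def bucket_for(key: str) -> int:
    """Map a key to a bucket number using part of its SHA-256 hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS

def build_index(records: list[dict]) -> list[list[dict]]:
    """File every record into the bucket that belongs to its last name."""
    buckets = [[] for _ in range(NUM_BUCKETS)]
    for record in records:
        buckets[bucket_for(record["last_name"])].append(record)
    return buckets

def lookup(buckets: list[list[dict]], last_name: str) -> list[dict]:
    """Hash the search term and compare only the records in that bucket."""
    return [r for r in buckets[bucket_for(last_name)] if r["last_name"] == last_name]

# Hypothetical sample data
residents = [
    {"first_name": "Anna", "last_name": "Meyer", "address": "1 Main Street"},
    {"first_name": "Ben", "last_name": "Schulz", "address": "2 Park Lane"},
    {"first_name": "Annette", "last_name": "Meyer", "address": "3 Hill Road"},
]
index = build_index(residents)
print(lookup(index, "Meyer"))
```

Note that this approach finds exact matches for the hashed field. A search for all names beginning with “Ann” would still need a different kind of index, because similar inputs produce completely different hash values.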
