The Ultimate Manual For Tokenization

Tokenization has become a popular buzzword in the world of data security, as it can be applied to a wide range of industries and use cases. If your company stores and processes sensitive data, you need to understand how tokenization works. This guide will explain everything you need to know.

What is Tokenization Anyway?

Tokenization is the process of replacing sensitive data with non-sensitive stand-ins. Each piece of sensitive data is swapped for a “token” that can be used in its place without exposing the original value.

In theory, tokenization can be applied to any type of sensitive data. This data security practice is most commonly associated with credit card processing, but it can also be used to secure medical records, bank transactions, loan applications, stock market trading, criminal records, social security numbers, and more.

Tokens have no real value on their own. They simply replace sensitive data while still allowing the data to serve its purpose.

Casino chips are a great analogy. People at a casino can play poker, blackjack, or other table games using chips that represent real money. But they can’t use those chips to buy groceries—the cash itself is protected by the token.

How Tokenization Works

In terms of data security, the tokenization process is a bit more complex. Tokens can be generated in the following ways:

  • A reversible cryptographic function that uses a key
  • A nonreversible function (like a hash function)
  • A randomly generated number or an index function
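As a rough sketch of the last two methods in the list above, the Python below uses only the standard library. The function names and the demo key are assumptions made for illustration, and a keyed hash (HMAC) is just one reasonable choice for the nonreversible case. The reversible, key-based method is left out because it calls for a vetted encryption library.

```python
import hashlib
import hmac
import secrets

# Hypothetical key for illustration only; a real key would come from a
# key management system, never from source code.
DEMO_KEY = b"demo-key-not-for-production"

def hash_token(value: str, key: bytes = DEMO_KEY) -> str:
    """Nonreversible token: a keyed hash (HMAC-SHA256) of the value."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

def random_token(num_bytes: int = 16) -> str:
    """Random token: unrelated to the value, so it needs a lookup table."""
    return secrets.token_hex(num_bytes)

card = "4111111111111111"  # a widely used test card number
print(hash_token(card))    # same input and key always give the same token
print(random_token())      # different on every call
```

The hash-based token can be recomputed from the value but never reversed, while the random token has no relationship to the value at all and only means something to whoever holds the lookup table.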

Once the tokenization process is complete, the tokens are a safe way to use the data because the tokens themselves are nonsensitive. The sensitive data is stored in a centralized server called a “token vault.” The token vault is the only place where a token can be mapped back to the sensitive data it represents.
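To make the vault idea concrete, here is a minimal in-memory sketch. The TokenVault class and its method names are hypothetical, and a Python dictionary stands in for what would really be a hardened, separately hosted service.

```python
import secrets

class TokenVault:
    """Toy in-memory vault; a real one would be a hardened, separate service."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so one value maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = secrets.token_urlsafe(16)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # Raises KeyError for tokens this vault did not issue.
        return self._token_to_value[token]

vault = TokenVault()
token = vault.tokenize("123-45-6789")
print(token)                    # random string with no link to the value
print(vault.detokenize(token))  # prints 123-45-6789
```

Because the token is random, nothing outside the vault can turn it back into the original value.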

For reversible tokens, the data is not typically stored in a vault. Instead, vaultless tokenization keeps the sensitive data secure with an algorithm.

The terms tokenization and encryption are commonly associated with each other, although the two data security methods are not the same.

Encryption uses an algorithm to transform information into a non-readable format. To access the encrypted data, you need to have an encryption key.

The purpose of encryption is to protect sensitive information from anyone for whom the data is not intended. Even if someone were to access the encrypted data, they would have no way to decipher it without a key.

Tokens are just a placeholder for sensitive data (remember, think casino chips). The actual data associated with that token is stored elsewhere. If a hacker steals the tokens, they have no value—the data doesn’t reside in the same environment as the token.

To truly understand tokenization, you must also get familiar with the term “detokenization.”

As the name implies, detokenization is the reverse process that exchanges the token for its original value. The only way to perform detokenization is through the original tokenization system—there’s no other way to retrieve the original data using just the token.

Tokens work well for one-time uses, like payment transactions, that don’t need to be stored beyond their initial use. But they also work well for high-value items, such as keeping a credit card on file for recurring transactions.

Tokenization ultimately keeps sensitive data safe from both internal and external threats. In many instances, tokenization can be used as a way to stay compliant with regulations like PCI DSS, GDPR, HIPAA, and more.

Let’s take a close look at some real-world examples of tokenization in use:

Example #1: Credit Card Processing

The most common use case for tokenization is credit card processing. Businesses that process credit cards must remain PCI compliant. Otherwise, they can be hit with hefty fines or lose the right to process credit cards altogether.

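According to PCI DSS, sensitive authentication data (such as the card’s security code) can’t be stored after a transaction is authorized, and any stored card numbers must be protected. That’s why many businesses avoid keeping raw card numbers in a POS system or database at all.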

In this scenario, businesses can use a payment service provider that takes the card information and turns it into a token. Since the tokens themselves don’t actually contain the cardholder data, they are useless to anyone with malicious intent.

Employees, hackers, or cybercriminals attempting to steal credit card numbers won’t have any luck if they get their hands on tokens.

With credit card processing, it’s common for tokens to preserve the format of the original card number. For example, the token will have the same number of characters as the card number, and the last four digits of the card will remain visible.

This format is useful for general business operations. A customer may ask an employee what card they have on file, and the employee could give an appropriate answer like “Visa ending in 6789.”
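Here is a simplified sketch of a format-preserving token like the one described above. The function name and the card number are made up for illustration, and real providers use their own, more careful schemes.

```python
import secrets

def format_preserving_token(card_number: str) -> str:
    """Replace all but the last four digits with random digits."""
    digits = [c for c in card_number if c.isdigit()]
    keep = digits[-4:]
    random_part = [secrets.choice("0123456789") for _ in digits[:-4]]
    return "".join(random_part + keep)

card = "4111111111116789"  # made-up number for illustration
token = format_preserving_token(card)
print(len(token) == len(card))    # True
print("ending in " + token[-4:])  # prints: ending in 6789
```

A real provider would also record the mapping in its vault and make sure the token can’t be mistaken for a valid card number. This sketch skips both steps.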

In addition to the storage of credit cards, tokenization is also used to validate card transactions.

When a credit card gets swiped or entered into an ecommerce site, a lot goes on behind the scenes. The card data passes through the credit card processor, acquiring bank, card network, and issuing bank before ultimately getting sent back to the merchant with an “approve” or “deny” message.

But the card number itself isn’t used as the information changes hands. Instead, the number is turned into a token. Once the token reaches the card network, the network checks the token vault to verify the account number before sending the token back down the line.

All of this happens in a matter of seconds and keeps the sensitive cardholder data secure as the transaction gets processed.

Example #2: Blockchain

Blockchain has been a hot topic since the explosion of cryptocurrency, although the underlying idea of cryptographically linking records into a chain is older than crypto itself.

With blockchain, tokens become a digital representation of real-world assets. These are known as security tokens or asset tokens.

For centralized economic models, banks and financial institutions are responsible for managing the integrity of the transaction ledger. But for decentralized scenarios, like cryptocurrency, the responsibility shifts to the individual users involved with the transaction.

Tokens in a blockchain are linked back to the real-world asset. Transactions are grouped into blocks, and each block depends on the blocks before it for verification.
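The way entries in a chain depend on each other can be sketched with a simple hash chain. This is a toy model under assumed names (block_hash, add_block, is_valid), not how any particular blockchain is implemented, and it leaves out consensus, signatures, and networking entirely.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block's contents, including the previous block's hash."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def add_block(chain: list, data: str) -> None:
    """Append a block that records the hash of the block before it."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"data": data, "prev_hash": prev})

def is_valid(chain: list) -> bool:
    """Check that every block still points at its predecessor's hash."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )

chain = []
add_block(chain, "token A issued to Alice")
add_block(chain, "token A transferred to Bob")
print(is_valid(chain))  # True

chain[0]["data"] = "token A issued to Mallory"  # tamper with history
print(is_valid(chain))  # False
```

Changing an earlier entry changes its hash, so every later entry that points back to it no longer checks out.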

Any tokenized asset in a blockchain can be traced back to the asset that it represents while keeping the data associated with that asset secure.

Example #3: Customer Data Storage Compliance

Laws and industry-specific regulations are tightening in various jurisdictions. Two common examples include HIPAA (Health Insurance Portability and Accountability Act) and GDPR (General Data Protection Regulation).

HIPAA defines how patient records must be stored and shared in the healthcare space, and GDPR is for consumer data protection in the European Union.

Let’s take a closer look at GDPR and how tokenization can be used here for compliance. Just know that the same concept can be applied to any regulation or data security practice with similar information.

The GDPR requires businesses to protect the personal identifiers associated with the customer data they store. In practice, that often means keeping identifiers such as a person’s name, phone number, or address separate from records like their transaction history.

One safeguard the regulation recognizes is a process called pseudonymization. Tokenization is one way to accomplish pseudonymization, as it takes the personal identifiers in consumer data and turns them into tokens.
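As a hedged illustration of tokenization used for pseudonymization, the sketch below swaps the identifier fields of a customer record for random tokens. The field names, the pseudonymize function, and the sample record are all assumptions, and the lookup dictionary stands in for a table that would be stored and secured separately.

```python
import secrets

PII_FIELDS = ("name", "phone", "address")  # assumed set of identifiers

def pseudonymize(record: dict, lookup: dict) -> dict:
    """Return a copy of record with identifier fields swapped for tokens.

    The token-to-value pairs go into `lookup`, which stands in for a
    separately stored and separately secured table.
    """
    safe = dict(record)
    for field in PII_FIELDS:
        if field in safe:
            token = "tok_" + secrets.token_hex(8)
            lookup[token] = safe[field]
            safe[field] = token
    return safe

lookup_table = {}
order = {"name": "Ada Example", "phone": "555-0100", "total": 42.50}
safe_order = pseudonymize(order, lookup_table)
print(safe_order)  # name and phone are tokens; total is unchanged
```

The transaction history can now be stored and analyzed without the identifiers, and only someone with access to the lookup table can link a record back to a person.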

Example #4: User Authentication

Tokens can also be used to verify the identity of users, adding an extra layer of security to access sensitive information.

For websites and applications that require a username and password, a session token can be generated at login and stored in the browser. This gives the user access to other pages on the domain for a specified period of time without needing to re-authenticate on every page.

Once the session is over, the token is destroyed, so the account remains secure.

Here are some common examples of token-based authentication that we’ve all seen on a regular basis:

  • Accessing an account from a single-use text or email code
  • Logging into a third-party website using Gmail credentials
  • Unlocking a smartphone or app with a fingerprint
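The session-token flow described in this section can be sketched in a few lines. Everything here is an assumption for illustration: the function names, the 15-minute lifetime, and the in-memory dictionary that stands in for a server-side session store.

```python
import secrets
import time
from typing import Optional

SESSION_TTL_SECONDS = 15 * 60  # assumed 15-minute lifetime
_sessions = {}  # token -> (username, expiry time); in-memory for the demo

def start_session(username: str, now: Optional[float] = None) -> str:
    """Issue a random session token after a successful login."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    _sessions[token] = (username, now + SESSION_TTL_SECONDS)
    return token

def check_session(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the username for a live token, or None if unknown or expired."""
    now = time.time() if now is None else now
    entry = _sessions.get(token)
    if entry is None:
        return None
    username, expires_at = entry
    if now >= expires_at:
        del _sessions[token]  # destroy the token once the session is over
        return None
    return username

token = start_session("alice", now=1000.0)
print(check_session(token, now=1060.0))  # prints alice
print(check_session(token, now=2000.0))  # prints None (expired)
```

The `now` parameter exists only so the example can simulate the passage of time. The password is checked once at login, and after that each page request only has to present the token.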

Example #5: Physical Asset Exchange

Art tokenization is a really unique example, but I wanted to include it to showcase the versatility of tokens. Here’s how it works.

An organization that owns a rare or valuable piece of artwork can have the work appraised to set its value. Based on this value, the artwork can be converted into digital tokens for sale on the open market.

Buyers can purchase these tokens to create a portfolio of fractional art shares and sell those tokens for profit.

Alone, the tokens hold no real value. But when authenticated through the platform managing the artwork, the token represents a fractional share of a physical asset.

JPMorgan has used this same concept to tokenize gold bars.

How to Get Started With Tokenization

Now that you understand how tokenization works, it’s time to apply this concept to your specific use case. Here are the tactical steps required to get started with tokenization:

Step 1: Identify What You Need to Secure

What are you trying to tokenize? This must be clearly defined before you continue.

As you’ve seen from the examples above, tokenization can be applied to a seemingly endless range of physical and digital assets.

Most businesses getting started with tokenization are trying to protect sensitive consumer data or payment transactions. You could also use tokenization to secure sensitive company data, like payroll information or employee records.

Step 2: Choose a Token Generation Method

Based on the information you’re trying to protect, you need to decide how your tokens will be generated.

First, choose between single-use tokens and multi-use tokens. Single-use tokens work well for things like one-time transactions or user authentication. Multi-use tokens would be required for long-term storage—like keeping credit card data on file or storing personal customer data.
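The difference between the two token types comes down to what happens when a token is redeemed. The sketch below uses a hypothetical TokenStore class with made-up values to show the idea.

```python
import secrets

class TokenStore:
    """Toy store that issues single-use or multi-use tokens."""

    def __init__(self):
        self._entries = {}  # token -> (value, single_use flag)

    def issue(self, value: str, single_use: bool) -> str:
        token = secrets.token_urlsafe(16)
        self._entries[token] = (value, single_use)
        return token

    def redeem(self, token: str) -> str:
        value, single_use = self._entries[token]  # KeyError if unknown
        if single_use:
            del self._entries[token]  # a single-use token dies on first use
        return value

store = TokenStore()
one_time = store.issue("order-1001 payment", single_use=True)
on_file = store.issue("card on file for customer 42", single_use=False)

print(store.redeem(on_file))   # works every time
print(store.redeem(one_time))  # works once; a second redeem raises KeyError
```

A multi-use token stays valid until it’s explicitly revoked, which is why it suits recurring billing, while a single-use token limits the damage if it’s ever intercepted.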

Once you’ve narrowed this down, you need to decide how the tokens will be generated.

Are you going to use a mathematically reversible cryptographic function and key? What about a hash function that’s non-reversible? Or do you want a randomly generated number?

Step 3: Select a Tokenization Provider

Now that you have a firm grasp of your needs, you’ll need a tokenization provider to make all of this happen for you. Most tokenization providers offer a wide range of options, but you need to verify that the solution you’re using fits the criteria identified in steps one and two.

Depending on the use case, choosing a provider can be fairly easy.

For example, payment processors will offer tokenization to your business if you want to start accepting credit cards. They’ll provide you with all of the hardware and technology required to process the cards, and the tokens will be generated and processed by them on the backend.

If you’re using tokenization to authenticate users or protect sensitive business data, then you’d need to use another data security solution. The process here won’t be the same as tokenization for payment processing.

Step 4: Pick a Storage Environment

With tokenization, you need to store two different things. First, you need a place to store the tokens. But you also need a place to store the original and sensitive data.

These two cannot be stored in the same place. That way, if someone steals the tokens, they can’t do anything with them.

If you’re going to host the tokens in-house, you need to make sure that your environment supports this. In many cases, it’s better to just use the storage system that’s set up by your tokenization provider. They should already have the infrastructure in place to handle everything you need.

You’ll also need to decide between using a token vault or vaultless storage.

Step 5: Understand Detokenization

In many instances, tokenization is not permanent. You can ultimately exchange the token for the original sensitive data.

But this process can only occur through the platform you’re using to generate the tokens.

Let’s stick with the credit card processing example, as this is the most common use case for tokenization. If you’re keeping customer cards on file for recurring billing or faster checkouts, each time that person makes a transaction, the token must be authenticated before the payment can be processed.
