Introduction
In the digital era, the volume of user data being processed is growing rapidly, and so are the risks of leaks and unauthorized access. Two effective ways to protect this data are hashing (for passwords and certain types of confidential information) and obfuscation (or encryption) of data transmitted and stored in event streams.
Without these measures, a system can suffer not only from direct hacking but also from inadvertent leaks, for example, when logs are sent to a public repository or when Kafka access settings are misconfigured.
In this article, we will explore the most commonly used algorithms, how they look in Go, and why these steps get special attention in information security.
TL;DR
- Hashing protects passwords and personal data by making the original values impractical for attackers to recover.
- Obfuscation (or encryption) safeguards the information in event streams (Kafka), preventing third parties from reading it in case of leaks.
- Kafka certificates and clearly defined access roles reinforce security and avert accidental leaks in the console or consumer group.
- Following gold-standard rules (a separate database for each service, migrations for any modifications, secure storage of passwords and keys, etc.) makes a system more resilient against cyberattacks.
Main Section
1. Hashing Data in the Database
Why is it important?
- Security in the event of database compromise: If someone gains access to your database (by dumping it or obtaining root privileges), hashed passwords or other sensitive information remain unreadable in plaintext form.
- Protecting brand and reputation: Leaking hashed credentials without the possibility of easily restoring the original data is much less damaging to a company’s reputation than leaking passwords in plain text.
Commonly used algorithms
- bcrypt: Designed specifically for passwords; its adjustable “cost” parameter slows brute-force attempts by making each hash more expensive to compute.
- SHA-256: A fast general-purpose hash, suited to non-password data such as checksums and identifiers.
Short Go example (bcrypt)
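package main

import (
	"fmt"

	"golang.org/x/crypto/bcrypt"
)

// hashPassword returns a bcrypt hash of password at the library's default cost.
func hashPassword(password string) (string, error) {
	hash, err := bcrypt.GenerateFromPassword([]byte(password), bcrypt.DefaultCost)
	if err != nil {
		return "", err
	}
	return string(hash), nil
}

func main() {
	hash, err := hashPassword("example-password")
	if err != nil {
		panic(err)
	}
	fmt.Println(hash)
}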
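hashPassword generates a hash at bcrypt.DefaultCost, the library's default work factor. bcrypt generates a random salt automatically and embeds it in the returned hash string.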
Short Go example (SHA-256)
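package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashData returns the hex-encoded SHA-256 digest of data.
func hashData(data string) string {
	hash := sha256.Sum256([]byte(data))
	return hex.EncodeToString(hash[:])
}

func main() {
	fmt.Println(hashData("example-data"))
}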
SHA-256 produces a fixed-length, cryptographically strong digest. Because it is fast and unsalted, it is a poor fit for passwords; use it for other kinds of data, such as checksums or lookup identifiers.
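A plain SHA-256 digest of a low-entropy value, such as a phone number or email address, can be reversed by hashing every candidate value and comparing. One common mitigation is a keyed hash (HMAC-SHA256). The sketch below is an illustration under assumptions, not the article's own method: the names keyedHash and sameHash and the demo key are hypothetical, and a real key would come from a secret store.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// keyedHash returns the hex-encoded HMAC-SHA256 of data under key.
// The function name and signature are illustrative assumptions.
func keyedHash(key []byte, data string) string {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(data)) // hash.Hash.Write never returns an error
	return hex.EncodeToString(mac.Sum(nil))
}

// sameHash compares two digests in constant time.
func sameHash(a, b string) bool {
	return hmac.Equal([]byte(a), []byte(b))
}

func main() {
	// Hypothetical demo key; load the real one from a secret store.
	key := []byte("demo-key-do-not-use-in-production")

	token := keyedHash(key, "+1-555-0100")
	fmt.Println(token)
	fmt.Println(sameHash(token, keyedHash(key, "+1-555-0100")))
}
```

Without the key, an attacker who obtains the digests cannot test candidate values offline, while the service can still match records by recomputing the digest.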
2. Obfuscation (Encryption) in Data Streaming
When a system generates events (for example, through Event Sourcing) for marketing or analytics and sends them to Kafka (or another streaming service), there’s a risk that these messages may contain personal user data (email addresses, phone numbers, etc.). Obfuscation helps ensure this data won’t be accessible in plain text.
Use case: Marketing and user data
- Marketers need the data to segment the audience or evaluate user behavior.
- Each user event (registration, clicks, purchases, etc.) is sent to Kafka.
- If the events contain unencrypted personal data, an accidental leak or a “rogue” consumer connecting to the topic could have serious consequences.
Best practices:
- Encrypt the payload with a strong algorithm, such as AES.
- Use Kafka certificates to enable secure authentication and on-the-wire encryption.
- Restrict access to Kafka topics via ACLs (Access Control Lists) and role-based policies.
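On the client side, certificate-based authentication usually comes down to building a TLS configuration that trusts the cluster's CA and presents a client certificate. The sketch below uses only the Go standard library and makes several assumptions: the function name newKafkaTLSConfig is hypothetical, the three PEM inputs are supplied by you, and how the resulting tls.Config is handed to a Kafka client depends on the client library you use.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"os"
)

// newKafkaTLSConfig builds a mutual-TLS client configuration from PEM data.
// The name and signature are illustrative assumptions.
func newKafkaTLSConfig(caPEM, certPEM, keyPEM []byte) (*tls.Config, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no CA certificates found in caPEM")
	}
	cert, err := tls.X509KeyPair(certPEM, keyPEM)
	if err != nil {
		return nil, fmt.Errorf("load client key pair: %w", err)
	}
	return &tls.Config{
		RootCAs:      pool,                    // CA used to verify the brokers
		Certificates: []tls.Certificate{cert}, // identity presented to the brokers
		MinVersion:   tls.VersionTLS12,
	}, nil
}

func main() {
	if len(os.Args) != 4 {
		fmt.Println("usage: tlsconfig <ca.pem> <client-cert.pem> <client-key.pem>")
		return
	}
	var pems [3][]byte
	for i, path := range os.Args[1:] {
		data, err := os.ReadFile(path)
		if err != nil {
			panic(err)
		}
		pems[i] = data
	}
	cfg, err := newKafkaTLSConfig(pems[0], pems[1], pems[2])
	if err != nil {
		panic(err)
	}
	fmt.Println("client certificates loaded:", len(cfg.Certificates))
}
```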
Short Go example (AES)
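package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"fmt"
	"io"
)

// encryptAES encrypts plaintext with AES in CFB mode and returns
// base64(IV || ciphertext). The key must be 16, 24, or 32 bytes long.
func encryptAES(plaintext, key string) (string, error) {
	block, err := aes.NewCipher([]byte(key))
	if err != nil {
		return "", err
	}
	// Reserve the first block for a random IV, which travels with the ciphertext.
	cipherText := make([]byte, aes.BlockSize+len(plaintext))
	iv := cipherText[:aes.BlockSize]
	if _, err := io.ReadFull(rand.Reader, iv); err != nil {
		return "", err
	}
	// CFB provides confidentiality only; it does not detect tampering.
	stream := cipher.NewCFBEncrypter(block, iv)
	stream.XORKeyStream(cipherText[aes.BlockSize:], []byte(plaintext))
	return base64.URLEncoding.EncodeToString(cipherText), nil
}

func main() {
	// Demo only: a 16-byte key selects AES-128 (24 or 32 bytes select AES-192/256).
	// Load real keys from a secret store instead of hard-coding them.
	key := "myVerySecretKey1"
	data := "User Email: user@example.com" // placeholder address
	encrypted, err := encryptAES(data, key)
	if err != nil {
		panic(err)
	}
	fmt.Println("Encrypted Data:", encrypted)
}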
Thus, even if someone gains access to a message, they won’t be able to decrypt its contents without the key.
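Encryption alone hides the content but does not reveal whether a message was altered in transit. As an alternative to the CFB example, the sketch below swaps in AES-GCM, an authenticated mode from the Go standard library, and includes decryption. The function names newGCM, encryptGCM, and decryptGCM are hypothetical, and the random demo key stands in for one loaded from a secret store.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM AEAD; key must be 16, 24, or 32 bytes.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encryptGCM returns base64(nonce || ciphertext || tag).
func encryptGCM(plaintext, key []byte) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	// Seal appends ciphertext and tag to the nonce, so the nonce travels with the message.
	sealed := gcm.Seal(nonce, nonce, plaintext, nil)
	return base64.URLEncoding.EncodeToString(sealed), nil
}

// decryptGCM reverses encryptGCM and fails if the message was modified
// or the key is wrong.
func decryptGCM(encoded string, key []byte) ([]byte, error) {
	raw, err := base64.URLEncoding.DecodeString(encoded)
	if err != nil {
		return nil, err
	}
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(raw) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, body := raw[:gcm.NonceSize()], raw[gcm.NonceSize():]
	return gcm.Open(nil, nonce, body, nil)
}

func main() {
	key := make([]byte, 32) // random AES-256 key, for the demo only
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	enc, err := encryptGCM([]byte("event payload with personal data"), key)
	if err != nil {
		panic(err)
	}
	dec, err := decryptGCM(enc, key)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(dec))
}
```

A fresh random nonce is generated for every message; reusing a nonce with the same key breaks GCM's guarantees.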
3. Exceptions and Additional Measures
Although obfuscation is crucial, some business scenarios need de-identified data, such as age or country, to stay readable for analytics. In those cases, you need to:
- Remove personal fields (full name, exact email, phone number), replacing them with anonymized identifiers.
- Use certificates and authentication in Kafka so that no one without sufficient privileges can “join” your consumer group or read messages directly from the console.
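The first point above can be sketched as a small transformation applied before an event is published. This is only an illustration under assumptions: the UserEvent and AnalyticsEvent types, their fields, and the Tokenizer are hypothetical, and the in-memory map stands in for a protected lookup store.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// UserEvent is a hypothetical raw event that still carries personal fields.
type UserEvent struct {
	UserID  string
	Email   string
	Phone   string
	Age     int
	Country string
	Action  string
}

// AnalyticsEvent keeps only the fields analytics needs.
type AnalyticsEvent struct {
	AnonID  string `json:"anon_id"`
	Age     int    `json:"age"`
	Country string `json:"country"`
	Action  string `json:"action"`
}

// Tokenizer maps real user IDs to random identifiers.
// It is not safe for concurrent use; the map stands in for a protected store.
type Tokenizer struct {
	tokens map[string]string
}

func NewTokenizer() *Tokenizer {
	return &Tokenizer{tokens: make(map[string]string)}
}

// Token returns the stable random identifier for userID, creating it on first use.
func (t *Tokenizer) Token(userID string) (string, error) {
	if tok, ok := t.tokens[userID]; ok {
		return tok, nil
	}
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	tok := hex.EncodeToString(buf)
	t.tokens[userID] = tok
	return tok, nil
}

// Deidentify drops personal fields and swaps the user ID for its token.
func (t *Tokenizer) Deidentify(e UserEvent) (AnalyticsEvent, error) {
	tok, err := t.Token(e.UserID)
	if err != nil {
		return AnalyticsEvent{}, err
	}
	return AnalyticsEvent{AnonID: tok, Age: e.Age, Country: e.Country, Action: e.Action}, nil
}

func main() {
	t := NewTokenizer()
	out, err := t.Deidentify(UserEvent{
		UserID: "42", Email: "hidden", Phone: "hidden",
		Age: 31, Country: "DE", Action: "purchase",
	})
	if err != nil {
		panic(err)
	}
	payload, err := json.Marshal(out)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(payload)) // JSON without email, phone, or the real user ID
}
```

Because the identifier is random rather than derived from the user ID, it cannot be reversed without access to the lookup store.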
Sample command for secure topic access:
kafka-console-consumer --bootstrap-server kafka:9092 \
  --topic user-events \
  --consumer.config client-ssl.properties
The client-ssl.properties file contains the certificate and key information needed for authentication.
Conclusion
By combining hashing and obfuscation, as well as properly configuring access roles and migration processes, you achieve a multi-layered defense for user data. This not only lowers the potential risk of leaks but also enhances user and regulatory confidence in your system.
Gold-Standard Rules
- Database per service: Each microservice should have its own database to isolate breaches and prevent unwanted cross-service access.
- Each user has a unique account: No shared accounts or passwords; this simplifies audit trails and accountability.
- Access roles clearly defined by policy: All data operations should be governed by predetermined rules (who can read, who can modify, etc.).
- All database changes via migrations only: Avoid arbitrary manual modifications to the database schema. Migrations provide transparency, versioning, and control.
- All passwords and keys in Vault or k8s secrets (including encryption keys and any secret used in hashing): Secrets should never be stored in code or exposed in Git. Use specialized solutions (HashiCorp Vault, Kubernetes Secrets, etc.).
By following these recommendations, you significantly strengthen security in your infrastructure and minimize financial and reputational risks when dealing with user data.