Understanding Safetensors

2025-07-07

Introduction

In the world of machine learning, how you store your models matters just as much as how you train them.

Enter Safetensors - a modern file format that’s changing how we handle model weights and embeddings. But what makes it special, and why should you care?

A bit of background: when I started exploring options for running my own local LLM setup, I quite often noticed warnings similar to the ones below.

vae/diffusion_pytorch_model.safetensors not found
safetensors not found, falling back to .bin format

I was quite intrigued and wanted to know more about this format. Here is what I understood:

What are Safetensors?

Safetensors is a file format specifically engineered for the machine learning community. It provides a secure and efficient way to store model weights, embeddings, and other large binary data.

While formats like PyTorch’s .pt and TensorFlow’s .h5 have served us well, Safetensors addresses their key limitations, particularly in security and performance.

The Pickle Problem

The .bin files you often encounter in machine learning are typically PyTorch models serialized using Python’s pickle protocol. When you see warnings like “falling back to .bin format,” it means the system is reverting to these pickle-based files, which come with several problems:

  1. Security Vulnerability: Pickle can execute arbitrary Python code during deserialization
  2. Python-Only: Limited to Python environments, no cross-language support
  3. No Validation: No built-in checks for data integrity or format verification
  4. Opaque Format: Binary blob with no readable metadata or structure
# This innocent-looking .bin file could contain malicious code
import pickle

# Loading a .bin file actually unpickles Python objects
# This can execute ANY Python code embedded in the file!
model = pickle.load(open("model.bin", "rb"))

This is why the ML community needed a safer alternative - enter Safetensors.

Key Features

  1. Enhanced Security
  • Zero Code Execution: Unlike Pickle (used by PyTorch), Safetensors prevents arbitrary code execution during loading
  • Safe Deserialization: Eliminates risks associated with loading untrusted data
  • Predictable Behavior: No hidden side effects during model loading
  2. Optimized Performance
  • Fast Loading: Optimized for rapid access to large tensor data
  • Memory Mapping: Efficient handling of large models through memory-mapping techniques (see the sketch after this list)
  • Binary Format: Compact storage with minimal overhead
  3. Framework Flexibility
  • Cross-Platform: Works seamlessly across PyTorch, TensorFlow, and other ML frameworks
  • Universal Compatibility: Standardized format for easier model sharing
  • Deterministic Loading: Consistent results across different environments
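
Here is a minimal sketch of the lazy-loading behavior mentioned above, assuming a local model.safetensors file (the file name is a placeholder): safe_open parses only the header up front and fetches individual tensors on demand.

from safetensors import safe_open

# Lazy / partial loading: the header is read first,
# tensor data is only pulled when explicitly requested.
with safe_open("model.safetensors", framework="pt", device="cpu") as f:
    print(f.keys())                  # tensor names, no tensor data loaded yet
    name = next(iter(f.keys()))
    tensor = f.get_tensor(name)      # loads just this one tensor
    print(name, tuple(tensor.shape))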

Why Choose Safetensors?

  1. Security Benefits

When downloading pre-trained models from the internet, security is paramount. Safetensors eliminates the risk of malicious code execution that could be hidden in model files.

  2. Performance Advantages
# Traditional PyTorch loading
model = torch.load("model.pt")  # Potential security risk

# Safer alternative with Safetensors
from safetensors.torch import load_file
model = load_file("model.safetensors")  # Secure and efficient

  3. Industry Adoption

Major platforms like Hugging Face have embraced Safetensors as their preferred format, making it increasingly important in the ML ecosystem.

Comparison with Other Formats

Feature              | Safetensors | .bin (Pickle) | PyTorch (.pt) | TensorFlow (.h5)
---------------------|-------------|---------------|---------------|-----------------
Security             | ✅ High     | ❌ Low        | ⚠️ Medium     | ⚠️ Medium
Load Speed           | ✅ Fast     | ⚠️ Medium     | ✅ Fast       | ✅ Fast
Cross-Framework      | ✅ Yes      | ❌ No         | ❌ No         | ❌ No
Code Execution Risk  | ❌ None     | ⚠️ High       | ⚠️ Possible   | ⚠️ Possible
File Size            | ✅ Compact  | ⚠️ Variable   | ⚠️ Variable   | ⚠️ Variable
Metadata Support     | ✅ Yes      | ❌ No         | ⚠️ Limited    | ⚠️ Limited
Memory Mapping       | ✅ Yes      | ❌ No         | ❌ No         | ❌ No

Security Deep Dive

The Pickle Vulnerability: To understand why .bin files pose such a risk, consider this example:

# malicious_model.py - A seemingly innocent model file
import pickle
import os

class MaliciousModel:
    def __init__(self):
        self.weights = {"layer1": [0.1, 0.2], "layer2": [0.3, 0.4]}

    def __reduce__(self):
        # This code runs during unpickling!
        return (os.system, ('echo "Your system has been compromised!" > /tmp/hacked.txt',))

# Save as .bin file
with open("model.bin", "wb") as f:
    pickle.dump(MaliciousModel(), f)

# When someone loads this "model"...
# pickle.load(open("model.bin", "rb"))  # Creates /tmp/hacked.txt!

The security issue in the provided code is arbitrary code execution during the deserialization of a pickled file. This is a well-known and critical vulnerability associated with Python’s pickle module.

When you load a seemingly harmless file, such as a machine learning model, with pickle.load(), it can execute malicious commands on your system. This happens because pickle is not designed to be secure against maliciously crafted data.

How the Attack Works

The vulnerability is exploited through the __reduce__ method. In Python, __reduce__ tells pickle how to serialize an object. When a pickled object with a custom __reduce__ method is deserialized, the callable named in its return value is executed.

In the provided example, the MaliciousModel class defines a __reduce__ method that has been deliberately crafted for malicious purposes:

def __reduce__(self):
    # This code runs during unpickling!
    return (os.system, ('echo "Your system has been compromised!" > /tmp/hacked.txt',))

Here’s a breakdown of what happens:

1. Overriding __reduce__: The attacker creates a class (MaliciousModel) with a custom __reduce__ method.

2. Specifying a Malicious Command: This method returns a tuple whose first element is a callable (os.system) and whose second element is a tuple of arguments for that callable (('echo "Your system has been compromised!" > /tmp/hacked.txt',)).

3. Serialization: The attacker saves an instance of this class to a file (model.bin) using pickle.dump().

4. Deserialization and Code Execution: When a victim loads this file with pickle.load(open("model.bin", "rb")), the unpickler invokes os.system('echo "Your system has been compromised!" > /tmp/hacked.txt'), which creates a file named hacked.txt in the /tmp/ directory.

While this example command is relatively benign, an attacker could just as easily provide a command to delete files, download and execute malware, or create a reverse shell to gain full control over the system.
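
By contrast, a Safetensors file is just a small JSON header followed by raw tensor bytes, so loading it is pure data parsing. The following sketch (assuming a local model.safetensors file) inspects a file using nothing but the standard library; there is no step at which embedded code could run.

import json
import struct

with open("model.safetensors", "rb") as f:
    header_len = struct.unpack("<Q", f.read(8))[0]  # first 8 bytes: header size (little-endian)
    header = json.loads(f.read(header_len))         # plain JSON describing each tensor

for name, info in header.items():
    if name != "__metadata__":                      # optional free-form metadata entry
        print(name, info["dtype"], info["shape"])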

Getting Started

Installation

pip install safetensors

Basic Usage

from safetensors.torch import save_file, load_file

# Saving your model
state_dict = model.state_dict()
save_file(state_dict, "model.safetensors")

# Loading your model
loaded_state_dict = load_file("model.safetensors")
model.load_state_dict(loaded_state_dict)
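
The format also supports a small, optional string-to-string metadata block stored alongside the tensors. A sketch reusing the state_dict from above (the metadata keys here are just placeholders):

from safetensors import safe_open
from safetensors.torch import save_file

# Attach free-form string metadata when saving
save_file(state_dict, "model.safetensors", metadata={"format": "pt", "note": "example run"})

# Read the metadata back without loading any tensors
with safe_open("model.safetensors", framework="pt", device="cpu") as f:
    print(f.metadata())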

Migrating from .bin to Safetensors

  1. Converting Existing Models

If you have models in .bin format, here’s how to convert them:

import torch
from safetensors.torch import save_file

# Load the .bin file (be careful - only from trusted sources!)
model_state = torch.load("model.bin", map_location="cpu")

# Save as safetensors
save_file(model_state, "model.safetensors")
print("Conversion complete! You can now safely delete the .bin file.")

  2. Batch Conversion Script

For multiple models:

import os
import glob
import torch
from safetensors.torch import save_file

def convert_bin_to_safetensors(directory):
    bin_files = glob.glob(os.path.join(directory, "*.bin"))

    for bin_path in bin_files:
        try:
            # Load .bin file
            state_dict = torch.load(bin_path, map_location="cpu")

            # Create safetensors path
            safe_path = bin_path.replace(".bin", ".safetensors")

            # Save as safetensors
            save_file(state_dict, safe_path)
            print(f"✅ Converted: {os.path.basename(bin_path)}")

        except Exception as e:
            print(f"❌ Failed to convert {bin_path}: {e}")

# Usage
convert_bin_to_safetensors("./models")

  3. Hugging Face Models

For Hugging Face models, use the built-in conversion:

from transformers import AutoModel

# Load model (will use .bin if safetensors not available)
model = AutoModel.from_pretrained("bert-base-uncased")

# Save with safetensors format
model.save_pretrained("./my_model", safe_serialization=True)

Real-World Use Cases

  1. Production Deployments

    • Secure model serving in production environments
    • Safe handling of third-party models
  2. Research Collaboration

    • Sharing models across institutions
    • Reproducible research with deterministic loading
  3. Model Distribution

    • Safe distribution of pre-trained models
    • Efficient storage in model hubs

Conclusion

Safetensors represents a significant step forward in secure and efficient model storage. Whether you’re working on production systems or collaborative research, its combination of security, performance, and ease of use makes it an excellent choice for modern ML workflows.

The official Python documentation strongly warns against unpickling data from untrusted sources. To mitigate this risk:

  1. Never unpickle data from a source you don’t trust.

  2. Use safer serialization formats like JSON for simple data structures, or safetensors for machine learning models, which do not have the ability to execute arbitrary code.

  3. If you must use pickle, ensure the integrity of the file by verifying its source and using cryptographic signatures (e.g., HMAC) to check for tampering, as sketched below.
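
For that last point, here is a minimal sketch of HMAC verification before unpickling (the key and file paths are placeholders): the signature is checked first, and the file is only unpickled if it matches.

import hashlib
import hmac
import pickle

SECRET_KEY = b"replace-with-a-real-secret-key"

def load_pickle_verified(data_path, sig_path):
    """Unpickle a file only if its HMAC-SHA256 signature matches."""
    data = open(data_path, "rb").read()
    expected = open(sig_path, "rb").read()
    digest = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    if not hmac.compare_digest(digest, expected):
        raise ValueError("Signature mismatch - refusing to unpickle")
    return pickle.loads(data)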

Additional Resources