Lee Bernick

Software Engineer, NYC

Safe Serialization of Machine Learning Models

Posted at — May 4, 2020

At my previous job, I helped to release a serialization library for arbitrary Python machine learning models. The library prevents remote code execution vulnerabilities during model deserialization by using GPG keys to sign serialized models, and only deserializing models that were signed with trusted keys.
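
The general idea looks roughly like the sketch below, which uses the python-gnupg package to sign a pickled model and to verify the detached signature before unpickling. The function names and key-handling details here are illustrative assumptions, not the library's actual API.

```python
# Minimal sketch of the approach (illustrative only, not the library's real API):
# sign the serialized bytes with a GPG key, and refuse to unpickle anything
# whose signature doesn't verify against a trusted key.
import pickle
import gnupg  # pip install python-gnupg

gpg = gnupg.GPG()  # uses the default GPG home directory and keyring

def save_model(model, path, key_id):
    """Serialize the model and write a detached signature alongside it."""
    data = pickle.dumps(model)
    with open(path, "wb") as f:
        f.write(data)
    signature = gpg.sign(data, keyid=key_id, detach=True)
    with open(path + ".sig", "wb") as f:
        f.write(signature.data)

def load_model(path, trusted_fingerprints):
    """Verify the detached signature against trusted keys before unpickling."""
    with open(path, "rb") as f:
        data = f.read()
    verified = gpg.verify_data(path + ".sig", data)
    if not verified.valid or verified.fingerprint not in trusted_fingerprints:
        raise ValueError("Refusing to deserialize: untrusted or invalid signature")
    return pickle.loads(data)
```

The key point is that `pickle.loads` (which can execute arbitrary code) is only ever reached after the signature check passes, so an attacker who tampers with a stored model can't get their payload deserialized without also controlling a trusted key.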

If you'd like to learn more, check out the original post I wrote introducing this library on Medium, or this talk I gave at a Boston Python meetup.

N.B.: Since writing the original post, I've learned more about security researchers' concerns around PGP's usability and around poor implementations of the protocol. These issues are important to understand when using PGP or building secure applications, but cryptographic verification of serialized models is still a major step up from expecting users to individually verify every model they deserialize, so I'd still encourage interested readers to take a look at the original post.