Happy Friday! I hope you’ve all had a productive and fun week.
Cloud security is a neglected topic in Machine Learning but is crucial to the success of an ML System. In a well-organised business, security isn’t the remit of MLOps but should be considered. In this issue, we’re going to take a look at how you can harden your ML environment.
Looking to develop your MLOps career? We’re now offering coaching to create a career plan to help you land your dream MLOps job. Head over to the MLOps Now website to take a look and sign up. See you there!
Who put a hole in this bucket?
It’s every Cyber Security professional’s worst nightmare. A malicious actor siphoning customer data for weeks with no one noticing.
It’s even worse when it’s credit card information.
This nightmare became a reality for Capital One on July 17th 2019, when they discovered an exposed web server on AWS that allowed users to access any data they wanted from S3 buckets.
The breach affected 100 million customers and cost Capital One $270 million.
While the hack wasn't the fault of an ML system, it's a reminder to protect ML infrastructure.
If you want to learn more about the Capital One hack check out this great video by Kevin Fang.
The cutting room floor
Data Science teams treat data a bit like the cutting room floor:
- Extract a sample from an original source (e.g. database, other files, online, APIs)
- Trim and edit it to use for modelling
- Train with it
- Forget about it for the next shiny dataset
This process leaves shared development environments littered with samples.
Restricting this process isn't the right choice as it is part of experimentation. But who can access this "cutting room floor" is important to consider.
Model endpoints are another vulnerability that needs consideration.
Accepting all requests allows bad actors to hammer the endpoint, creating issues in production. Also, feeding in false data can disrupt Continuous Training, delaying model improvements.
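One common way to stop a single caller hammering an endpoint is to rate limit requests. Below is a minimal sketch of a token bucket in Python. The class name, parameters and the injectable `clock` are illustrative assumptions, not part of any particular serving framework, and a managed gateway would normally do this for you.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Top up tokens for the time elapsed since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should reject the request, e.g. with HTTP 429
```

In practice you would keep one bucket per caller (for example per API key) and reject the request whenever `allow()` returns `False`.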
Don’t let your cloud rain data
When securing a Machine Learning system there are 5 things to consider:
- User access
- Network access
- Protecting information at rest
- Endpoint protection
- Engaging Cyber Security
User access
By default, ML services like SageMaker create user roles that are too open for accessing data.
IAM roles & policies help restrict unauthorised access to data or deployments. This is more powerful when combined with network access (next point).
User access is the foundation of protecting ML systems. IAM roles & policies can separate users into appropriate tiers of access, minimising security risks.
An example of a tiered approach would be:
- Data Scientists - Able to bring data in (write) and access it (read) but not allowed to move it from the environment. Appropriate access to ML tooling (e.g. JupyterLab).
- MLOps Engineers - Perform the same actions as Data Scientists but have the ability to trigger deployments.
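As a rough sketch, the two tiers above could be expressed as IAM policy documents built in Python. The bucket name and the exact set of actions are assumptions for illustration. This sketch also doesn't try to stop data leaving the environment or cover tooling access, which usually need network controls and extra permissions on top.

```python
def data_scientist_policy(bucket: str) -> dict:
    """Read and write access to the development data bucket only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadWriteDevData",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {
                "Sid": "ListDevData",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


def mlops_engineer_policy(bucket: str) -> dict:
    """Everything a Data Scientist can do, plus triggering deployments."""
    policy = data_scientist_policy(bucket)
    policy["Statement"].append(
        {
            "Sid": "TriggerDeployments",
            "Effect": "Allow",
            "Action": [
                "sagemaker:CreateEndpointConfig",
                "sagemaker:CreateEndpoint",
                "sagemaker:UpdateEndpoint",
            ],
            # Assumption: narrow this to specific endpoint ARNs in a real setup.
            "Resource": "*",
        }
    )
    return policy
```

Building the MLOps policy on top of the Data Scientist one keeps the tiers in step: change the base tier and the higher tier follows.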
Network access
Virtual Private Clouds (VPCs) segment infrastructure to restrict network connectivity. This is a powerful way to restrict movement of data.
Placing your development, staging and production environments in VPCs helps prevent data leakage. Using AWS PrivateLink, you can create an internal connection between VPCs, keeping traffic off the internet and hidden from the outside world.
Combining VPCs with IAM adds another layer of security. VPCs can restrict who can access them based on their IAM role and also their IP, perfect for enforcing company VPNs.
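One way to tie data access to the network is a bucket policy that denies any request not arriving through a specific VPC endpoint. The sketch below builds such a policy using the `aws:SourceVpce` condition key. The function name, bucket name and endpoint ID are hypothetical.

```python
def vpc_only_bucket_policy(bucket: str, vpc_endpoint_id: str) -> dict:
    """Deny all S3 actions on the bucket unless the request comes via the given VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {
                    "StringNotEquals": {"aws:SourceVpce": vpc_endpoint_id}
                },
            }
        ],
    }
```

Be careful when testing a policy like this. A blanket deny also applies to requests made from outside the VPC by administrators, so it's easy to lock yourself out.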
An important and easy step is to make sure buckets are not publicly accessible.
S3 buckets are private by default, but a misconfigured bucket policy or ACL can expose them to the internet. Explicitly blocking public access guards against this. Combine this with VPCs and you've made a good start in hardening data access.
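On S3, the "not publicly accessible" setting is called Block Public Access. Here is a small sketch of applying it from Python. It assumes you pass in a client created with `boto3.client("s3")`; the helper name is made up for illustration.

```python
# All four flags on means no public ACLs or public bucket policies take effect.
PUBLIC_ACCESS_BLOCK = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}


def block_public_access(s3_client, bucket: str) -> None:
    """Turn on every Block Public Access flag for a single bucket."""
    s3_client.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration=PUBLIC_ACCESS_BLOCK,
    )
```

Taking the client as an argument keeps the helper easy to test with a stub and leaves credentials and region handling to the caller.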
Protecting information at rest
Another super simple way to protect data is to encrypt it at rest.
Encrypting data with a key restricts access to those who hold the decryption key. This minimises the damage of a breach because, without the key, the data is unreadable.
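As an illustration, here is how default encryption with a KMS key might be switched on for an S3 bucket. It assumes a client from `boto3.client("s3")` and a KMS key you have already created; the helper names and the key ID are placeholders.

```python
def default_encryption_config(kms_key_id: str) -> dict:
    """Server-side encryption rule: encrypt new objects with the given KMS key."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": kms_key_id,
                },
                # Bucket keys reduce the number of calls made to KMS.
                "BucketKeyEnabled": True,
            }
        ]
    }


def enable_default_encryption(s3_client, bucket: str, kms_key_id: str) -> None:
    """Apply the default encryption rule to a bucket."""
    s3_client.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration=default_encryption_config(kms_key_id),
    )
```

With a KMS key, reading the data needs permission on the key as well as on the bucket, which is what makes leaked objects harder to use.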
Endpoint protection
If models are only used internally, there is no need to expose them to the internet.
Protect your endpoints by placing them in a VPC. Setting this VPC to be accessible only by internal IPs improves security. To get access to your endpoint an attacker would first have to breach your network.
An extra layer of security is to use an API gateway. A gateway can be configured to accept certain IAM roles and rules and reject anything else.
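To make the gateway idea concrete, below is a sketch of an Amazon API Gateway resource policy that only lets named IAM roles invoke the API. The ARNs are placeholders, and it assumes the API's methods use IAM authorisation so that the caller's role is actually checked.

```python
def gateway_resource_policy(api_arn: str, allowed_role_arns: list) -> dict:
    """Allow only the listed IAM roles to invoke any stage, method or path of the API."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowKnownRolesOnly",
                "Effect": "Allow",
                "Principal": {"AWS": list(allowed_role_arns)},
                "Action": "execute-api:Invoke",
                "Resource": f"{api_arn}/*",
            }
        ],
    }


# Example with placeholder identifiers (not real resources).
policy = gateway_resource_policy(
    "arn:aws:execute-api:eu-west-1:123456789012:abc123",
    ["arn:aws:iam::123456789012:role/internal-scoring-client"],
)
```

Anything not matched by the allow statement is rejected, which is the "accept certain roles, reject anything else" behaviour described above.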
Engaging Cyber Security
Security is an important part of MLOps but MLOps Engineers are not the authority on the best way to do it. Loop in your Cyber Security team early and often to get their guidance and green light.
Life is much less painful when you engage Cyber Security early on.
Having their guidance provides more confidence and less risk of rearchitecting later on.
That’s it for this week. Don't forget:
- Let me know your thoughts. How does your company consider security for Machine Learning?
- Be sure to share this with colleagues and friends. Make sure they sign up as well!
- If you haven’t already, follow me on Twitter and connect on LinkedIn.
See you next week,
Huw