Don't use ENV variables for secret data (2017) (diogomonica.com)
110 points by cimnine on July 13, 2020 | 140 comments


So the author offers two alternatives:

1. Using docker-secret inside of a Docker swarm

2. Using Keywhiz [1], a Java server together with a FUSE client.

This seems overkill for a lot of cases. If environment variables are such a security problem, why not just use a config file (not checked into the source code repository) with proper permissions set?

[1] https://developer.squareup.com/blog/protecting-infrastructur...
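
A rough sketch of that config-file approach (file name and contents are hypothetical): the file is created with owner-only permissions, and the launcher reads it itself instead of exporting values, so child processes never inherit the secret.

```shell
# Hypothetical config file; mode 600 so only the service user can read it.
conf_dir=$(mktemp -d)
install -m 600 /dev/null "$conf_dir/app.conf"
cat > "$conf_dir/app.conf" <<'EOF'
DB_PASSWORD=example-password
EOF

# The app (or its launcher) parses the file directly; nothing is exported.
db_password=$(sed -n 's/^DB_PASSWORD=//p' "$conf_dir/app.conf")
echo "perms: $(stat -c %a "$conf_dir/app.conf")"
```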


It reminds me a lot of .NET Core's newish "Secret Manager."

They created a user-profile storage vault that's outside the deployment/source code path (like ENV variables), and then, for some inexplicable reason and without good justification anywhere, tell you not to use it in production and to use their paid service instead ("Azure Key Vault", $3/100K requests), which adds multiple extra points of failure (even ignoring Azure's reliability problems).

Naturally people will repeat Microsoft's advice verbatim without justifying it themselves, like this SO answer[0]:

> Don't use app secrets in production. Ever. As the article says DURING DEVELOPMENT.

But WHY?! And the article they linked doesn't tell you WHY either, just points to a paid service. And when these people get poked for an explanation, they just wrap the same secrets in another layer of abstraction, but really haven't changed the security of the operations they're performing.

For example if a server node gets compromised and that node is authorized to make requests to "Azure Key Vault," it too can request the keys. The abstraction may make sense for public-private key scenarios where the actual private certificate is never returned, but a lot of what the "Secret Manager" returns are raw database credentials and encryption keys, making this paid abstraction more beneficial for centralized management than actual security.

If people want to argue for more abstraction: Fine. But they have to explain the logic behind the security.

[0] https://stackoverflow.com/questions/39668456/how-to-deploy-a...


Another reason is that you have a single place to store your secrets shared between multiple services. Now you only have one place to update when the secret expires/is revoked.


If I understand correctly, the primary advantage of using a secrets manager is that you can log all of the requests.

If your security is compromised, it's compromised, but having at least some kind of auditing can be hugely beneficial from a security standpoint.


https://12factor.net/

Config is the third commandment. IMO it also makes it super easy to swap configs for different environments.


Did you read the article? The author knows about it, mentions it in the first sentence, and is still against putting secrets there.


> not checked into the source code repository

Or checked-in, for easy distribution, just encrypted:

https://github.com/sobolevn/git-secret


I mean anything encrypted needs to be decrypted, meaning you have to... have the key stored in an environment variable on the server?


This would at least alleviate having to know n different env secrets that need to be set.


Yeah, it’s convenient but it doesn’t solve the topic of the post, which is “you shouldn’t use environment variables for secret data”


Install the secret for production. For testing you just lock/unlock the secret.


Or in a file, like ansible does it.


I don't agree at all. The reasons in the article all seem like "envs are bad because if you make a mistake you can expose them". This is not exclusive to envs; it applies to all secrets, independent of the medium used to make them available to the process using them.

In my experience, if you prevent using envs for secrets (as docker swarm does) all you get is a disgruntled programmer reading the contents of a secret file to an env in the entrypoint.


I think what the author means is environment variables are particularly vulnerable to being logged by accident, because:

1. They're stored right next to variables like PATH, JAVA_HOME, LC_ALL and PYTHONPATH which people might plausibly decide to log out every time

2. They'll get printed any time someone writes a shell script with set -x then uses the environment variable.

3. They'll probably end up in your developers' ~/.profile or ~/.bashrc, meaning any program logging the environment will log it, not just your program

4. Because they'll be in ~/.profile or similar, the secret will be in a file on disk anyway and a secret that's in one place is always better than a secret that's in two places.

With that said, a lot of CI servers that support "secure variables" offer those as environment variables and nothing else. So I can understand why people might end up stuck with environment variables despite their downsides.
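
For what it's worth, point 2 is easy to demonstrate. A sketch with a hypothetical API_KEY; under `set -x` the shell traces each command with its expanded arguments, so the secret lands in stderr and, from there, typically in the logs:

```shell
export API_KEY=hunter2                        # hypothetical secret
# Capture what the xtrace output looks like for a command using the var:
trace=$( { set -x; true "token=$API_KEY"; } 2>&1 )
echo "$trace"   # prints something like: + true token=hunter2
```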


This all seems like a lot of work for a problem that is largely theoretical. How often do Fargate or Lambda function VMs get broken into in the first place? I could see this maybe being a concern if you're running big, long-running VM instances with lots of colocated services (as well as utilities for process management, ssh support, log exfiltration, package management, etc., and all of the extra drudgery/attack surface you have to manage yourself when you opt out of serverless).

Also, application secrets for development environments probably shouldn't be super sensitive in the first place, right? For example, for a third party API key for a service like Auth0, we would have a dev tenant within Auth0 so even if a developer's environment is compromised, it can't jeopardize production.


It's really the latter half of your fourth point. Many programs log envvars for either debugging or intrusion detection purposes, just like they log command invocations. In a multiuser environment this can be problematic, especially in a business with multiple security areas. E.g. you don't want IT (who can read the logs) getting access to production resources in a hard-to-track way (by reusing a production key instead of their more closely monitored access system).

I tend to stick to files, it's just more convenient and reusable. But if you're on a single user system don't worry about passing things via the command line or via envvars.


>1. They're stored right next to variables like PATH, JAVA_HOME, LC_ALL and PYTHONPATH which people might plausibly decide to log out every time

No, they aren't. You just create a .env file in your project root and run "source .env" before you run your application.

Unlike Windows, there are no global environment variables on Linux. All of them are hierarchical and only exist within the process in which they were created and its children.


> 1. They're stored right next to variables like PATH,

So what? If the problem is that people might accidentally log secrets then the problem is not env variables.

> 2. They'll get printed any time someone writes a shell script

The set -x flag is a debugging flag to print out traces of a shell script being executed. This flag is disabled by default. Why is this far-fetched example being used to justify not using env variables?

> 3. They'll probably end up in your developers' ~/.profile or ~/.bashrc,

No, they don't. With containers you may set devel env variables one way or another (env files, setting up the IDE, setting env variables manually), but at most they are part of a testing config. The env variables used in production are handled by the deployment.

Even if you're able to sniff an env variable used in development, that has nothing to do with the production service.

> 4. Because they'll be in ~/.profile or similar, the secret will be in a file on disk anyway

This complaint makes no sense at all. No container orchestration system stores env variables in ~/.profile, although their containers use env variables for secrets extensively. The container orchestration service provides secret-management services, including setting up the env your containers are launched in, and all you need to care about is that your container accepts certain env variables that you will use in your app.

Even if you deploy services on bare metal, you should still use env variables for config and secrets, but you need to be able to manage secrets and rotate keys with a non-pet method.

> So I can understand why people might end up stuck with environment variables despite their downsides.

I'm sorry, but you failed both to present any downside of using env variables for secrets and to point out any scenario from the real world. I mean, anyone who has ever read any intro tutorial on how to deploy a system is well aware that you should not store passwords, keys, and secrets in plain-text files. That has nothing to do with using env variables at all. In fact, all your examples are entirely oblivious to the standard workflow of deploying a service, more so if it involves a container orchestration system.

So if the examples don't have any relation to basic practices and real-world workflows, why should anyone avoid using a best practice?


Also mentioned in the article, they pass down to forked processes by default.
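
A minimal demonstration of that default (SECRET_TOKEN is a made-up name); stripping the environment for children has to be done explicitly:

```shell
export SECRET_TOKEN=hunter2                   # hypothetical secret
# Any child process inherits it by default:
sh -c 'echo "child sees: $SECRET_TOKEN"'      # → child sees: hunter2
# Unless the parent explicitly strips the environment first:
env -i PATH="$PATH" sh -c 'echo "clean child sees: [$SECRET_TOKEN]"'   # → clean child sees: []
```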


There are actually key/value stores that solve this securely, such as Hashicorp Vault.

The issue isn’t that it can’t be done but more that most people either don’t already know it can be done or don’t want to invest in the infrastructure to do it.

Regarding the latter point, for self hosted solutions I can sympathise a little and it’s really a question of risk analysis. But most cloud computing services do offer their own secrets management service.

(not affiliated with Hashicorp and other services exist).


The problem with Hashicorp Vault (and their peers): Your application still needs a secret to access values made available to your application's role.

The values might not be in the immediate container space (well, aside from being in program memory), but they're only one (likely well documented internally to the container) hop away.


> The problem with Hashicorp Vault (and their peers): Your application still needs a secret to access values made available to your application's role.

True but those credentials can be decoupled from the application (like env vars are) so you satisfy the developer problem I was addressing.


I've constantly tried to figure out the answer to this. Is there literally any solution to this that doesn't involve the access key for the secrets vault being supplied by human input / secure hardware? But even in the case of secure hardware, if the hardware trusts the requesting application and that application becomes compromised, doesn't that defeat the purpose? Where is trust anchored?


Sure, and they are great. But in some cases it's unavoidable to read some secrets from the secret management service into envs. This is what docker swarm doesn't allow with the 'docker secret' command.


Great :-(

Had to read to the end of the description of why environment variables are bad to discover that it is effectively an advertisement for Docker. I don't use Docker so the article told me pretty much nothing that wasn't fairly obvious already, although it is a valuable reminder.


This is not Docker related.

If an application spawns a sub-process, that sub-process will inherit all environment variables. Which might be fine or might not be, e.g. if the spawned application is user controlled.

Also tools such as Airbrake or Sentry often send all your ENV variables to the error collection server, effectively exposing your secret values. Most such tools offer to filter variables, but that's in my experience almost always not done pro-actively.

The only thing that's Docker related in that post is that it does offer a turn-key solution for Docker-based projects. The solution's principle can be applied to other projects though, i.e. secrets should be read from a config file (or something like Vault).


> If an application spawns a sub-process, that sub-process will inherit all environment variables. Which might be fine or might not be, e.g. if the spawned application is user controlled.

Right. But isn't that well known?

Of course you have to explicitly define which environment variables are passed through to the process you are going to spawn, just as you would have to drop permissions to (configuration) files and possible limit capabilities if you start spawning untrusted processes.

IDK... For me it make sense that you should not give secrets over the command line, because they will appear in the program listings, but environment variable is pretty much ok in many cases.


> Also tools such as Airbrake or Sentry often send all your ENV variables to the error collection server, effectively exposing your secret values.

That only happens if whoever deployed those services screwed up badly and failed to configure any kind of filtering.

I mean, Airbrake specifically refers to their filtering system as a best practice to avoid leaking sensitive data.

https://airbrake.io/product/security

You're not commenting on a problem with env variables. You're commenting how poorly deployed and configured logging services can leak secrets.


> If an application spawns a sub-process, that sub-process will inherit all environment variables.

...“by default”, specifying the environment is something you do when creating a new process.


I wonder why you would run a user supplied application without sandboxing it (reset env, user with almost no permissions, ...)


I am progressively adopting Hashicorp Vault as the secrets manager of choice. It can be used in a variety of different scenarios: directly in the application using AppRole, with Terraform using its secrets provider, and by developers, who get their secrets and access regenerated when they authenticate to Vault.

This way I am not bound to docker swarm, or keywhiz, or god forbid AWS Secrets Manager.

As of now, I am still exposing secrets with Env Vars, but the next step is to use Vault directly. Vault has been pretty reliable so far. It is using AWS KMS for managing the master key and a scalable DynamoDB table for high availability backend.


Vault looks cool, but looking at the reference architecture [1] my gut tells me it's much easier to fuck up setting up a Vault cluster than environment variables.

[1] https://learn.hashicorp.com/vault/operations/ops-reference-a...


If you are in AWS, you can use S3 for data storage and DynamoDB for HA. Our first install was set up using their 'best practices' and consul, being stateful, is frustrating to run in kubernetes. We migrated to using S3/Dynamo and now have fewer moving parts and haven't seen any issues.


Thanks for the advice!


You can always use their official Terraform module for AWS if you are not comfortable with setting it all up by yourself: https://registry.terraform.io/modules/hashicorp/vault/aws/0....

But if you cut Consul as the backend (assuming you are not using Consul for other service discovery, Nomad, etc.), you can simplify that deployment a whole lot. You make sure you open the cluster communication port, set up an Application Load Balancer in front of the cluster to balance traffic and serve SSL, configure auto-unseal using AWS KMS (since you are not using it too often, $1 is OK to have AWS manage your master key), deploy Vault on every cluster instance, and use something managed like DynamoDB as your backend. I think this is a pretty simple yet scalable setup. The amount of cloud lock-in is pretty minimal and can easily be replaced with an HAProxy setup.


What's wrong with AWS Secrets Manager?

If I'm already working 100% in Amazon, I'm tempted to use Secrets Manager rather than justify the cost in hours to deploy and maintain a Vault cluster.

Interested in your opinion.


Not OP, but I've recently worked with AWS Parameter Store a lot, and I've been very happy with it so far. We store all configuration (including secrets) using it.

We started by using https://github.com/segmentio/chamber/, but because of the way we decided to structure our secrets we decided to write a clone of it ourselves (https://github.com/micvbang/confman-go).

In practice, we have a client written in Python that grabs configuration from Parameter Store and puts it into the process' ENV variables at startup. This allows us to avoid vendor lockin in the rest of our code, since we still just fetch all config from the environment. I like it so far :)
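
For illustration, that startup pattern might look roughly like this. The Parameter Store call is stubbed out so the sketch is self-contained; in reality the fetch would be something like `aws ssm get-parameters-by-path --with-decryption` (all names and values below are made up):

```shell
# Stub standing in for the Parameter Store client; a real one would
# call SSM and print KEY=VALUE pairs.
fetch_config() {
    printf 'DB_HOST=db.internal\nDB_PASSWORD=example-password\n'
}

# Export each key=value into the process environment before exec'ing
# the app, so application code just reads the environment as usual.
while IFS='=' read -r key value; do
    export "$key=$value"
done <<EOF
$(fetch_config)
EOF

echo "DB_HOST=$DB_HOST"
```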


IMO, it gets expensive quickly, and it was designed around AWS' use case, which involves credentials that roll on a regular basis. Outside of RDS, its value for the price goes downhill quickly.

To a sister comment who mentions parameter store - that UI is the biggest leaking bag of horse manure I've ever had the displeasure of using. We made the mistake of using it, and moved over to Vault at the first available opportunity. Vault isn't a silver bullet either, but the UI is at least usable.


The web UI is bad, but tools like chamber (https://github.com/segmentio/chamber) are excellent, so there's no need to deal with the web UI.


I assume you mean Parameter Store's web console UI? I've never really used it; we use only API endpoints and command-line tools, and it works great.


Cloud lock-in (not being able for example to manage cross-platform credentials) and cost.

By itself AWS Secrets Manager is fine, but as with everything AWS, billing is pretty opaque. If you couple this with a poorly planned Lambda setup, costs can potentially scale uncontrollably.

For what you are paying, AWS Secrets Manager seems too barebones/rigid.


Lock in shouldn't be too bad. It's create, get, set, list, delete, describe. I imagine the API footprint isn't much different if you switch over to something else. Most apps will just use get().


It shouldn't, but it can be very bad. And with vault you potentially circumvent that and end up with a tool that is way more flexible than Secrets Manager.


> This way I am not bound to [...] keywhiz

No, but you’re bound to Vault?

I agree that Vault is likely the best solution to this (dynamic secrets are super useful), but “not being coupled to something” is not a benefit you get from it.


You are right, I think that was a little harsh on my part. I can't really say much about keywhiz, I should have left it out of my comment.

Yes, you are tied to Hashicorp Vault once you adopt it, but it isn't much of a problem because you can easily redeploy it in a different cloud.

The same might be true of keywhiz, so my comment wouldn't be valid on that point.


How do you share the token to access Vault to your code?


Using AppRoles and consul-template. And if you're on Kubernetes, you can use https://github.com/sethvargo/vault-kubernetes-authenticator in an init container.


And how does k8s pass that value to the app?


You can use consul-template to generate a config file that lives in-memory (volumes.myvolume.emptyDir.medium = Memory in a PodSpec).
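
Roughly, the relevant PodSpec fragment (names here are hypothetical; in the Kubernetes API the emptyDir medium value is spelled `Memory`, which backs the volume with tmpfs so the rendered file never touches the node's disk):

```yaml
volumes:
  - name: vault-secrets        # hypothetical name
    emptyDir:
      medium: Memory           # tmpfs-backed, in-memory only
containers:
  - name: app
    volumeMounts:
      - name: vault-secrets
        mountPath: /etc/secrets
        readOnly: true
```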


Also, this pattern means path traversal vulnerabilities (not uncommon in web frameworks) have the potential to allow for privilege escalation on Linux via the /proc/self/environ file.

I've been on a pentest where a recently disclosed path traversal bug in Rails was not patched in the environment I was testing and I thought I would get at least some credentials from at least one service, but every host used a dedicated API for secret retrieval and there was nothing sensitive exposed via any system.

Maybe your threat model doesn't care, just adding a data point.
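
To illustrate what such a traversal hands the attacker, here's a sketch that reads a process's own environ the same way (DB_PASSWORD is a made-up secret):

```shell
# /proc/self/environ is NUL-separated; one traversal that reaches it
# dumps every variable the process was exec'd with, all at once.
leaked=$(DB_PASSWORD=hunter2 sh -c "tr '\0' '\n' < /proc/self/environ" \
         | grep '^DB_PASSWORD=')
echo "$leaked"   # → DB_PASSWORD=hunter2
```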


Any chance they documented that setup publicly? Would be interested to dive in to how all that works and gain any new insights


It's a large company that built their own bespoke internal credentials service running over a TCP port to the application with another proprietary protocol to push key material to hosts.

Can't say much more due to NDAs.

Edit: this service handles credentials and rotation for hundreds of thousands to millions of hosts.


!

Wild.

I guess with custom protocols there’s not much that can be learned from that setup. Presumably wrap the request in security, handshake to verify authority, use the custom protocol to deliver the secret which is also wrapped in security.

Too bad they keep it close to the vest, but I certainly don’t begrudge them for it.


I mean the same company built a service with better architecture that they sell as part of their managed computing environment options.

Some people complain about not wanting to use it due to "lock-in."


If you think that putting secrets in ENV is bad, you probably shouldn't take the advice of running 'docker service create --secret="secure-secret" redis:alpine' either!

Putting secrets in a command line makes them just as visible, in fact more so, to other processes. It also makes the secrets available to anyone regardless of permissions, since everyone can monitor the currently running processes and their args (e.g. a 'ps waux' or 'cat /proc/12345/cmdline')


`secure-secret` is not the secret itself. It is the name of the secret.


Oops, my bad. I saw the '--secret="secure-secret"' part of the command and assumed that was the secret itself! Thanks for the correction.


Aren't these equivalent, given that the /proc/x/environ file exists?


/proc/x/environ is similar, but has more restrictive permissions: user-readable only, whereas /proc/x/cmdline is world-readable.
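
You can check this on any Linux box:

```shell
# environ is readable only by the process owner (and root), while
# cmdline is world-readable, so command-line args leak to every local user.
stat -c '%a %n' /proc/self/environ /proc/self/cmdline
# → 400 /proc/self/environ
# → 444 /proc/self/cmdline
```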


Fair point.

Though I feel like in any modern app deployment scenario that isn't going to be a meaningful defensible boundary.

The answer is to use neither, especially given the number of times I've used a path traversal vulnerability to expose /proc/self/environ on a pentest, and the one time I was frustrated on a pentest when every app in the environment used a dedicated API for secret retrieval and the path traversal vulnerability I had access to was worthless.


How do docker secrets solve the problem then? According to the article they store the secret in a file as well. You could access the docker secret just as easily as environment variables. The only meaningful difference is that /proc/self/environ will return all secrets at once, whereas with docker secrets you have to know the file path.


They don't

There's a lot of bad advice on secret storage that is Not Based On A Threat Model™


they wind up in your bash history this way too


The proposed solution appears to be "use Docker", which is a bit lame.


Every production product I’ve ever worked on, the entire team put database credentials in the environment variables.


The author's arguments aren't compelling. Most of his points can be solved by properly configuring any logging processes to grab only what's necessary and not letting newbies commit unreviewed code.


But the author's argument is that people don't do that. You're pretty much saying "The argument for seatbelts is not compelling. The problem can be solved by driving carefully."


No, I would be saying that wearing a seatbelt improperly is obviously dangerous and ill-advised.


It's way better than hard coding them into the code.


Why is that?

Also, as an aside: The very premise of plaintext credentials for computer-computer database connections always seemed strange to me. Maybe I'm just not knowledgeable enough here, but I wish the standard for database credentials was key-based.


If you put secrets in your codebase, they're now on x machines, where x is the number of developers on your team, plus their old laptops they gave away to family members and forgot to wipe, plus Backblaze because one developer doesn't have git repositories excluded from their backup settings, plus GitHub because that's where your repo is hosted.

If you don't store secrets in your codebase, they're just on ~one machine: the server hosting your application.


Every server hosting your application. Things get trickier as systems scale.


Lots of reasons:

- passwords would end up in version control repositories. A whole article could be written on this point alone, but to summarise: those credentials will then be in your project's history forever more (or until nuking the history becomes more important than keeping the history)

- you can’t then change the credentials easily without having to push a new version of the application

- you expose the password to developers, which might be fine in smaller teams but larger organisations might separate those duties. You might also bring in contractors who you wouldn’t trust with DB access or shouldn’t have access for data compliance reasons

- you make it harder to have different credentials for different environments (e.g. dev, UAT, staging and production). There is no workaround for that which doesn't introduce other problems... aside from removing hardcoded passwords.


Well,

2/ you probably can't change the credentials just by changing the ENV anyway: there will likely be some kind of restart/reload to perform on one or many components (and such actions better be well logged and tracked, which happens with a redeploy)

3/ your production DB shouldn't be accessible directly with the password, otherwise you have a bigger problem

4/ it's not harder, it's just an if/switch away from you (instead of another set of tools)

It's good to have secrets managed, like API keys, private keys, ... but most of the time it's more about hiding them, which is not sufficient! And as the article says, it is very easy to let them slip through logs/dumps, as well as let some code treat them in an insecure (or even malicious) manner.


2. A service restart is always going to be less risky than shipping a new application (which would require a restart as well).

3. In an ideal world I’d agree with you. However in many places it is and even in places where it’s not, not hardcoding passwords helps provide defence in depth.

4. Harder to do it securely. Your solution leaks all passwords to all environments and is possibly the worst workaround to the problem (I’ve seen CI/CD inject the correct passwords, which seems the best approach but if you’re going to those lengths then you might as well use a secrets management service)


>those credentials will then be in your projects history forever more (or until nuking the history becomes more important than keeping the history)

With Git, it's not "nuking" so much as "retroactively creating a different timeline in which the secret wasn't shared." Still a pain in the ass if there's more than one developer.


Because if I give an external developer access to the source code for developing, I won't expose internal data and systems.

Bigger companies also may have different compliance restrictions which means developers don't get access to production, only the Administrators.


It is pretty well supported on many common databases.

A few years ago, I added short lived, auto-rotated certificates (signed by our internal CA) to all of our applications. We used these for mTLS for internal app to app calls.

I also wanted to use them to authenticate to the databases. MySQL and Oracle support this sort of authentication just fine. The obstacle I ran into was trying to explain to the DBAs what a certificate was and why it'd be a good idea to use it instead of having them manage user names and passwords for us. They decided it was too much work and stonewalled. I eventually gave up and moved on.


> Why is that?

You don't want to accidentally commit your credentials to github and have the world see them. At least if they're in ENV they stay private as long as your environment does.


What about the global .gitconfig?

One of the basic key value pairs is the GPG signingkey, which is usually stored in cleartext in the aforementioned file. Although the credential.helper is in my Keychain (iCloud-backed 2FA)

In theory someone could copy this and try to sign commits as me, but I have to think this value is unique and they would get rejected if they tried to use it.

My login credentials are 2FA as well, at least on unknown machines, so they would be prompted there as well.

Personal Access Tokens for the CLI would be another way to prevent nefarious things from happening.


Now every developer has access to any db credential that was in source control. If your project has had hundreds or thousands of developers that is a security concern.


If your database is accessible by everyone, then this is your security concern.


Surely you are missing the point on purpose aren't you? Authorization happens through secrets because it is trivial to spoof anything that is not a secret. If you have a party where only known celebrities are allowed to attend then I can just send a double. That same logic applies to two independent servers and creating a server that is mirroring another is much easier than finding a double. The only way to distinguish between a fake and the original is by looking at things the original doesn't share publicly because the double cannot replicate things it doesn't know. Ok, I hope this explanation was good enough to make you understand that authorization should be based on secrets because they cannot be replicated while kept secret.

So "if your database is accessible by everyone" then it must be because the secret is accessible to everyone. However, this is where you start to contradict yourself. You suggested that delivery of the secret is not a problem because "3/ your production DB shouldn't be accessible directly with the password, otherwise you have a bigger problem". As we have established, anything that isn't a secret can be replicated and therefore should never be relied upon. If you store passwords to your database in the source code, it's likely that you are also storing other secrets such as AWS keys or the password for the firewall dashboard in an accessible repository. Now an attacker is behind your firewall and can access the database anyway.

I don't understand why you seem to defend storing passwords in source code by arguing about an irrelevant detail. Even if you are completely uninterested in securing your application you would still store the passwords outside of the source code simply for convenience. When you have a production, staging and development environment then each environment will need a different password. The logical conclusion would be to make the password configurable just for this alone. 90% of the features that something like Vault provides are actually more about convenience and password management than actually increasing security. Most of the security benefits come from the fact that it encourages responsible handling of passwords, not from the fact that the software itself is more secure.


if you hard code the credentials you can’t tumble them without a complete rebuild. Also, and more importantly if you check in your code those creds are now in source control and should be considered compromised.

Injecting via secrets allows us to tightly control where the secret goes and who has access as well as make it easy to tumble.


This is commonplace for most people when they start working on a new project, including myself. And sometimes when you are rushing through to meet some deadline it slips through and you end up realizing a lot later. In order to tackle this, I've developed the habit of adding a middle layer in the code between the configuration (whether it be ENV variables or config files) and the actual application. So when you eventually have time or realize that you forgot about it, swapping/fixing it isn't an issue.


We have recently started removing credentials from env vars and started using Google Secret Manager (previously berglas) and it's been amazing so far. AWS and Azure should have the same. More challenging when you don't have all the infrastructure and services in place (i.e. when working on a plain VPS or other systems without those tools).


Hm, not sure I understand how Google Secret Manager relates to berglas, to be honest; I thought those two were separate APIs...

As for berglas itself, we also use it and have been very happy with it. Since you put just the names of your secrets into the ENV files, not the secrets themselves, they can be easily stored in version control, passed around in chat and you can just do whatever you want with them. Instead of:

    ENV_PASS=my-secret-pass
You do:

    ENV_PASS=berglas://bucket/secret-id
And it will be decrypted at the last possible moment - e.g. when the system starts. Or even later if you need to, if you use the apis provided.

Funnily enough, we had implemented almost the same approach with the AWS SSM APIs ourselves (https://github.com/ovotech/ssm-env-secrets). But I think it should be possible to use berglas in AWS directly without issue.
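
The pattern generalizes: a hypothetical resolver that walks the environment at startup and swaps reference-style values for real secrets. The `secret://` scheme and the `fetch` callback below are illustrative stand-ins for whatever backend client (berglas, SSM, ...) you actually use.

```python
def resolve_secret_refs(environ, fetch, scheme="secret://"):
    """Return a copy of `environ` where values like
    'secret://bucket/secret-id' are replaced by the fetched secret.

    `fetch` is whatever client talks to the secrets backend; plain
    values pass through untouched.
    """
    resolved = {}
    for key, value in environ.items():
        if value.startswith(scheme):
            resolved[key] = fetch(value[len(scheme):])
        else:
            resolved[key] = value
    return resolved
```

Resolving at the last possible moment means the version-controlled env file only ever contains the reference, never the secret itself.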


Amazon has a similar offering called Secrets Manager which can be used for sensitive secrets as well as configuration values. https://aws.amazon.com/secrets-manager/


For those in the comments suggesting to use another system to store credentials instead of ENV:

Do you need a password to access that system? Why not?


Are secrets specific to docker-swarm? If you're using plain docker, I imagine this wouldn't work?

k8s has its own secrets system built into deployments. I haven't used it though. I've been at shops that use k8s+vault, and other places that use marathon/DCOS+Consul.

If you're on AWS you can use pod2iam in a k8s cluster and then use the SSM parameter store to encrypt/decrypt secrets based on pod roles. I'm sure Google Cloud must have similar services.

The most agnostic way would be to mount in a file or volume at runtime. It's still accessible to the process, but just via the filesystem and not via environment variables. You still need to program with security in mind, but it's less likely for inadvertent leaks; basic layers of security. From there you could use something that encrypts that mount at rest and decrypt it when you start the container.

> Environment variables are passed down to child processes, which allows for unintended access.

Doesn't this depend on how you create the new process? fork() would keep a copy of the env in both parent/child processes, and exec would keep the env because it replaces the current process. But if you start a process using something like the subprocess module in Python, it would give you a fresh shell for that process, right?
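
For what it's worth, Python's `subprocess` inherits the parent's full environment by default; the child only gets a clean one if you pass `env=` explicitly. A quick sketch:

```python
import os
import subprocess
import sys

os.environ["APP_SECRET"] = "s3cr3t"  # pretend this arrived via the env

child_cmd = [sys.executable, "-c",
             "import os; print(os.environ.get('APP_SECRET'))"]

# Default (env=None): the child inherits the whole parent environment,
# secret included.
inherited = subprocess.run(child_cmd, capture_output=True,
                           text=True).stdout.strip()

# Passing env= replaces the inherited environment wholesale, so the
# child sees only what you hand it.
clean = subprocess.run(child_cmd, capture_output=True, text=True,
                       env={"PATH": os.environ.get("PATH", "")}).stdout.strip()

print(inherited)  # s3cr3t
print(clean)      # None
```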


The way I got around this was to store secrets in Google KMS encrypted files in Google Cloud Storage. The KMS key and encrypted files share the same name and can be accessed by that name programmatically. This secret storage method works really well for me and lets you easily access & manage secrets across all environments. It's so convenient, I sometimes even use this system as a simple key/value store.


That seems nice, but won't you be pwned the day that GCS accidentally loses your files, or your entire Google account gets locked because its algorithm mistakes it as a bot? The danger of that sounds greater than the possibility that your environment variables get leaked.


Neither of those scenarios quite applies to the cloud side of Google.


GCS seems like overkill, not sure why the op went that route. You can just store the encrypted secrets in the code and decrypt them with kms at start up.


They get backed up


I do this too, and it's so easy to manage all your local and remote environments from one place. I go one step further and push them into my environment variables too, rather than use a dotfile.


Why not use KMS for storing keys directly?


Not sure if it's still like this, but I got in on KMS early on and I think you could only store Google generated keys. However, I needed to be able to store API keys, passwords, etc... So I used the KMS generated keys to encrypt GCS files who themselves contained the API key, passwords, etc... that needed storing.


I came across a blogpost describing this workflow recently and I'm curious to hear HN opinions about it. Any pitfalls?

https://matthewdowney.github.io/encrypting-keys-in-clojure-a...

1. Generate a new set of API keys.

2. Read my encrypted map of keys from disk, decrypt it with a passphrase, assoc in the new key & secret, encrypt it again, and write it to disk.

3. At the entry point for my application, use (.readPassword (System/console)) to securely read in the passphrase, and then use it to decrypt the key file and read it into a Clojure map.

4. Instead of passing the key map around (allowing it to potentially escape into a debug log, or be printed at the REPL if I do something dumb), the top level code of my application passes the credentials into a signer-factory for each api that closes over the credentials.

    ;; The factory is shaped something like this
    (defn request-signer-factory 
      [{:keys [key secret]}]
      (fn [request-to-sign]
        (sign-request request-to-sign key secret)))
       
    ;; Then an API endpoint looks like this
    (defn place-order! 
      [signer {:keys [price qty side market post-only?]}]
      (let [request (comment "Format the order data for the exchange")
            signed (signer request)]
        (do-http-request! signed)))
I like this workflow more than others which are centered around only encrypting credentials inside of your Git repository, and decrypting them when you clone / pull, because it means that not even on my development machine are keys just sitting around in plaintext.


From experience, it's good to support multiple ways of configuring an app. Depending on your case, this can become hard or require naming conventions.

This way you can move secret data to files if needed depending on the deployment choice.

My own rule of thumb is:

1) sensible default value

2) read from file

3) read from env

Examples:

- https://github.com/spf13/viper

- https://docs.spring.io/spring-boot/docs/current/reference/ht...
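
A stripped-down sketch of that precedence chain in Python (env over file over default; the JSON file format is just an assumption for the example, viper and Spring Boot each have their own conventions):

```python
import json
import os


def load_setting(name, default=None, config_path="config.json"):
    """Resolve one setting: env var wins, then the config file,
    then the built-in sensible default."""
    # 3) env var has the highest precedence
    if name in os.environ:
        return os.environ[name]
    # 2) fall back to a config file, if one exists
    try:
        with open(config_path) as fh:
            file_values = json.load(fh)
        if name in file_values:
            return file_values[name]
    except (OSError, json.JSONDecodeError):
        pass  # no usable file in this environment; keep falling through
    # 1) sensible default
    return default
```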


When I was doing Scala development, we went the same route using TypeSafe Config. Default/file in the src/resources which could be overridden by ENV.

It seems like secrets aren't orchestration agnostic. You can't seem to use Docker secrets without being in swarm mode, and k8s has its own secrets management system (or if you're running on AWS, you can use pod2iam and ssm to store/get encrypted parameters). I've been at places that use k8s+vault as well.


And fix it by making it dependent on something docker specific?


I use and recommend Mozilla SOPS https://github.com/mozilla/sops


I'm surprised at the amount of responses here that hate this article.

It's not without cause/justification. Any stack tracking software (PagerDuty, Rollbar, NewRelic...) gets huge amounts of secrets pumped out to them regularly because of things like this.

And the author is not wrong, the environment doesn't need the secret. The application does.

Sure, you may not like the proposed solutions, but there are plenty out there, he just named 2.


Maybe this article makes sense if all of your stuff is in Docker. Otherwise it is just directly wrong. Use ENV variables; they work.


So to mitigate the security risk of ENV variables I should run a JavaVM, a Java Server, and a FUSE client? No thanks.


I just did a write up about how we use a secrets manager to load our environments allowing an easy centralised management across multiple projects/envs.

https://news.ycombinator.com/item?id=23822681


If migrating your infrastructure to swarm is not feasible:

- Make sure to sanitize the environment before spawning any child processes.

- Be sure to `set +x` (or your shell's equivalent) in your CI process.

- Make sure your secrets never get interpolated into a string by your scripting language.


Oddly not talked about much:

In a lot of cases, if an attacker gets even limited access to the application environment, dumping the ENV variables is trivial as they were never intended to be secure.

`process.env`, `printenv` (including php attacks), `ENV`, etc


Is transferring them to memory in your startup routine, then doing unsetenv() a reasonable mitigation?

It seems like it addresses several of the listed concerns. It's not perfect, of course, but perhaps better, and straightforward.
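
A sketch of that mitigation in Python (on POSIX, removing a key from `os.environ` calls `unsetenv()`). It keeps any child process spawned afterwards from inheriting the value, though the secret of course still lives in this process's memory:

```python
import os

SECRET_NAME = "APP_DB_PASSWORD"  # illustrative variable name
os.environ[SECRET_NAME] = "s3cr3t"  # stands in for the deploy-time env

# Copy the value into ordinary process memory...
db_password = os.environ.pop(SECRET_NAME, None)

# ...and scrub it from the environment, so children spawned from
# here on won't inherit it.
assert SECRET_NAME not in os.environ
```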


I don't understand why we have to put them into the environment in the first place (and then make sure we scrub it). Isn't it just as easy to read the secret from a file?


The difference could be that files are managed by the devs and env is managed by the ops. Depends on the team structure you have.


There are setups like batch schedulers and lambdas/serverless where using environment variables is pretty standard.


What's a best practice that doesn't use this Docker-specific feature ?


Use a config file not checked into git


This comes back to where does the config file live and how is it managed?


The plain config file must live in memory (tmpfs). The encrypted config file on a hard drive that is accessed only to decrypt the file to memory.


No, I mean management of the secrets. Though it is a rather moot point as the same problems apply to most systems anyway.


Wiki protected pages by role. When roles change so do passwords.


I'm curious why such an opinion is unpopular. ENV is too global and too easily exposed by accident. I think people just don't want to consider leaving 12factor (myself included).


> At a previous job I helped solve this problem with a really elegant solution

That distinction might go to something packaged in a single .h and .c file, if it passes additional smell tests.


I mean, if the argument against it is "what happens if I run arbitrary code on your server?", then the least of your problems is having secrets in your env...


There are secret managers in AWS and GCP.


Env vars are not collected by default within containers. If OP's alternative is to use Docker for every service, then the whole problem is nonexistent, since containers cannot access the host env.


Rails already solved this with credentials files...


The advice is good. But why is ENV capitalized? The term is just "environment variable". One heuristic I use for evaluating technical material is orthography: if you spell or capitalize something improperly, it's likely you'll get a lot else wrong too.

As for secret storage: didn't we solve this problem with keyrings? If I must put a secret in long-term plaintext storage, I might as well put it in a file, where I can see, access-control it, and audit it. Where's the audit log for someone reading an environment variable value?


Maybe he's got a history of working with PHP? $_ENV is an autoglobal array that gets filled with environment variables.

https://www.php.net/manual/en/reserved.variables.environment...


The example they use at the top of the article is in bash/shell, and seeing it written as in the article (as "ENV variables") is kinda like a code smell, indicating the author isn't really familiar with what they're criticizing. I think that's what GP is getting at, and it threw me as well - the article really only makes sense when looking at it from the perspective of the author not knowing how env vars are scoped.


It's very common to write ENV in all caps. You can be a bona fide master in UNIX and write it that way (just like some people write UNIX and others Unix).

Sorry, but technical knowledge/experience is not about particular spelling conventions and word proofing, anymore than they are about wearing "business" clothes...


All caps is part of it sure, but that's not the only thing going on. Given what "env" means, either "environment variables" or "env vars" is more normal. Mixing one short and one long is an oddity on top of the capitalization.

Like I said though, it's along the lines of a code smell. It's definitely possible the writer is simply doing it by reflex due to influence from some other language (but then, why not use that language in the example at the top?), but it sticks out and makes more familiar readers question why it's being written that way, where the author is actually coming from.


> Sorry, but technical knowledge/experience is not about particular spelling conventions and word proofing

Yes, actually, it is. Sloppy language indicates a sloppy mind. I absolutely crank down my estimate of someone's technical competence when I see him writing badly and will continue to do so. Once in a while, you do see bad writing coupled with good technical insight --- especially in infosec for some reason --- but it's the exception, not the rule.


>Sloppy language indicates a sloppy mind.

No, that's a 19th century schoolmaster idea.

Many great hackers are messy with grammar and spelling, and even dyslexic.


The 19th century had a lot of good ideas that we've managed to forget.


Saying "ENV," though, isn't writing badly. Or maybe it is, but it's not a sign of sloppiness. It's not like "JAVA," which was never a thing. It shouldn't trigger your brain defect detector.


It's also accessed that way in Ruby. ENV['MY_VAR']


It's also %ENV in Perl, and ENVIRON in AWK.


>One heuristic I use for evaluating technical material is orthography: if you spell or capitalize or spell something improperly, it's likely you'll get a lot else wrong too.

And while the program env is lowercase, the environment is commonly referred to as ENV in technical literature, APIs and so on. If you didn't know that, it's likely you are getting a lot else wrong too...

Plus, the "heuristic" is bogus anyway. Though it might be fine for evaluating candidates for secretarial roles though...


> the environment is commonly referred to as ENV in technical literature

No, it isn't.


It is. This is how I would've written it.


Perhaps the author wrote the article on their MAC.



