Hacker News

And yet again, twice in a row this time.

Note how the referenced VirusTotal result has 40+ detections [1]. I'm still wondering why info like this isn't used by PyPI and NPM. Chocolatey has VirusTotal integration for all releases.

And it's not like VirusTotal is the only option; there are CAPE [2] for dynamic execution, MetaDefender, and Intezer Analyze, to name a few.

It's really concerning that such a vital supply chain component is this easily abused.

One of the highlights is when someone recently used NPM to spread ransomware via a fake Roblox API package.[3]

[1] https://www.virustotal.com/gui/file/26451f7f6fe297adf6738295...

[2] https://github.com/kevoreilly/CAPEv2

[3] https://www.reddit.com/r/programming/comments/qgz0em/fake_np...



> Note how the referenced VirusTotal result has 40+ detections. I'm still wondering why info like this isn't used by PyPI and NPM.

I was contracted to help build a malware analysis pipeline for PyPI[1][2]. We don't currently have a VirusTotal detector/analyzer (IIRC, we couldn't get a high-enough volume API token on short order), but I think any work towards that would be greatly appreciated by both the PyPA members and the Python packaging community!

[1]: https://pyfound.blogspot.com/2018/12/upcoming-pypi-improveme...

[2]: https://github.com/pypa/warehouse/tree/main/warehouse/malwar...


It’s not clear that this would be useful; at least for the coa package, the DLL was downloaded dynamically via a script, so NPM would not have been able to detect it unless the script itself was flagged. Not sure what Chocolatey does, but it’s also hard to threshold on VirusTotal when there are a lot of FPs by random vendors.


We're building a solution to exactly this problem at Phylum. I'm not trying to be a sales shill, but if anyone is interested in discussing ideas on how best to defend open-source libraries from these types of attacks, please get in touch; I'd love to hear from you!

We're consuming everything we can about a package to figure this out. We've built a static analysis system to reason about code (it's not perfect, but we're getting better and better). We process all the data we can get, then build analytics, heuristics and ML models to extract evidence. The evidence is then pieced together to identify software supply chain risk.

In this case there is a lot of signal to show both bad and suspicious things are happening.

1. Obfuscation: obfuscated code produces a comparatively deep AST, which isn't difficult to identify.

2. Command execution: curl, wget, LOLBINs like certutil are pretty easy to identify. This isn't a slam dunk every time you see it, but it adds evidence to a potentially malicious claim.

3. URLs: These are uncommon in libraries and add evidence.

4. Pre/Post install scripts: These are fairly commonly used for other things as well, but invoking node on a source file that is likely obfuscated is a good sign something suspicious is happening.
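The four signals above can be sketched as a toy static check. To be clear, this is an illustrative sketch, not Phylum's actual pipeline; the thresholds, evidence strings, and command list are all invented:

```python
import re

# Hypothetical downloader/LOLBIN list -- a real system would be far broader.
SUSPICIOUS_COMMANDS = ("curl ", "wget ", "certutil")
URL_RE = re.compile(r"https?://[^\s\"']+")

def score_package(manifest: dict, source: str) -> list[str]:
    """Return a list of evidence strings for one package (toy heuristic)."""
    evidence = []

    # 1. Obfuscation proxy: very long lines stand in here for the
    #    "comparatively deep AST" check a real analyzer would do.
    if any(len(line) > 500 for line in source.splitlines()):
        evidence.append("possible obfuscation (very long lines)")

    # 2. Command execution: shelling out to a downloader or LOLBIN.
    if any(cmd in source for cmd in SUSPICIOUS_COMMANDS):
        evidence.append("invokes a downloader (curl/wget/certutil)")

    # 3. Hard-coded URLs are uncommon in library code.
    urls = URL_RE.findall(source)
    if urls:
        evidence.append(f"{len(urls)} hard-coded URL(s)")

    # 4. Pre/post-install scripts that invoke node on a source file.
    scripts = manifest.get("scripts", {})
    for hook in ("preinstall", "install", "postinstall"):
        if "node " in scripts.get(hook, ""):
            evidence.append(f"{hook} script invokes node")

    return evidence
```

A real system would parse the AST rather than proxy obfuscation with line length, and would weigh the pieces of evidence against each other instead of treating each signal as a boolean.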

We're trying to build everything fast enough to make the target far less attractive for attackers before it gets a lot worse.


Given that these attacks are becoming increasingly common, package registries could at least install each package (prior to publishing) in some isolated container or VM and then run some similar malware detection on the resulting file system.

Honestly, I'm strongly considering moving away from the NPM ecosystem because it's clearly become a target for malware.


But attackers are not dumb. They would circumvent whatever loose checks the package manager may have. Just considering your suggestion, the obvious immediate exploit is to not deploy the attack payload right away. Nothing you will think of will evade defeat.


I agree that it is an unending arms race, but if NPM doesn't even plug obvious holes (like running install scripts by default), then they've lost my trust.

Edit: if anyone knows of a way to disable NPM from running install scripts automatically (without having to remember to specify --ignore-scripts on each invocation), while still allowing me to use "npm run" to manually run scripts (e.g. test scripts for my own packages), I'd love to hear about it.


npm ci --ignore-scripts


Ah, sorry, I meant a way to automatically do that (both so I don't have to type as much and so that I can't accidentally forget to add that argument). Edited my comment to reflect this.


You can run this

  npm config set ignore-scripts true
which will update ~/.npmrc (you can also create project-specific .npmrc files if you prefer)
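For reference, a project-level .npmrc with the same setting is a single line:

```
ignore-scripts=true
```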

You can see how NPM has been configured by running

  npm config list


That breaks “npm run” (violating the second part of my request).


The manpage says that directly invoking a script from package.json with `npm run X` aka `npm run-script X` will still work with this preference set to true, but that it will not run pre/post scripts. Perhaps that is good enough for your use-case?

  npm help run-script
  
   [...]
   ignore-scripts
       o Default: false
       o Type: Boolean
       If true, npm does not run scripts specified in package.json files.

       Note  that  commands  explicitly  intended to run a particular script, such as npm
       start, npm stop, npm restart, npm test, and npm run-script will  still  run  their
       intended  script  if  ignore-scripts  is  set,  but  they will not run any pre- or
       post-scripts.


Edit: That doesn't work as described on NPM 6.14.14 (the version I had), but it does work on NPM 8.1.0 (the version I just upgraded to). Thanks!


Ah, looks like it was introduced in v7.0 - https://github.com/npm/rfcs/pull/185


> Nothing you will think of will evade defeat.

Trust. Why is it that random people can submit packages? Make it so they can't. Only trustworthy people should be able to do that. People who care, so that we don't have to. People we can trust. This is how Linux distributions work and you just don't see malware randomly making its way into official repositories.


You can DIY. There are also plenty of reputation-based dependency scanners out there (especially in the corporate world) that will look at the license, commit rate, number of committers, release frequency, transitive dependencies, etc. and generate a "safety" score for you.

E.g. "This is maintained by a huge network of contributors who contribute to other huge projects" vs "This is a single developer with a couple commits a year"
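A toy version of such a score might look like the following; the field names and weights are invented for illustration, and real scanners calibrate against far more data:

```python
def safety_score(pkg: dict) -> float:
    """Crude 0..1 reputation score from public repo metadata.

    Weights and caps are arbitrary; real scanners tune them against
    known-good and known-bad packages.
    """
    score = 0.0
    score += 0.3 * min(pkg.get("maintainers", 0), 10) / 10        # more eyes
    score += 0.3 * min(pkg.get("commits_last_year", 0), 100) / 100
    score += 0.2 * (1.0 if pkg.get("license") else 0.0)
    # Heavy transitive dependency trees widen the attack surface.
    score += 0.2 * max(0.0, 1.0 - pkg.get("transitive_deps", 0) / 500)
    return round(score, 2)
```

Feeding in a well-maintained profile (many maintainers, steady commits, a license) scores well above a single-developer package with a couple of commits a year.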


I wish it worked that way. I just peeked into how Python packages in Debian-based distros work. They are most frequently PyPI packages with some Debian wrapping, so we're back at the same problem.


PyPI allowed me to make an account and just push packages there like it was nothing. Great for me, not so great for users.

These Debian wrappers, however minimal, imply the existence of a maintainer trusted by the Debian community. It's assumed that this maintainer has read the source code and determined it is safe.


Or at least pinned a version that is known good


“Don’t let perfect be the enemy of good”


I've seen some talks about implementing this at the programming-language level but can't remember the specifics. Basically, treat dependencies like apps on a smartphone: each runs in its own namespace or security context, with control over what data gets passed in and out of the module or package. (In stark contrast to the current model, where everything just runs in a global namespace.)


We are working on this here [1].

Uses the object capability model provided by SES [2].

[1] https://github.com/LavaMoat/LavaMoat [2] https://github.com/endojs/endo


Why would malware authors target NPM in particular? Maybe they are targeting lots of package registries, and NPM is just more vigilant?


Chocolatey requires scans on _most_ uploads to the service. Under a certain threshold it only gives you a warning that says "this package had x detections".

Here's FAR Manager, which for some reason has some hits on VirusTotal:

https://community.chocolatey.org/packages/Far


I think Chocolatey does manual screening when there are more than 5 detections, but I'm not entirely sure.


I think that volunteers (some of them perhaps paid) should check the validity of code, at least for projects with over 10-100k downloads. In the case of crates.io (Rust), there is cargo-crev [1]. Also, npm should popularize 2FA.

[1] https://web.crev.dev/rust-reviews/


Say, like package maintainers do for major Linux distributions ?



