Embedded malware in RC (NPM package) (github.com/advisories)
140 points by hjek on Nov 5, 2021 | 114 comments


See also the ongoing discussion about malware in "Coa", another NPM package: https://news.ycombinator.com/item?id=29116878


Also recent:

NPM package ‘ua-parser-js’ with more than 7M weekly downloads is compromised - https://news.ycombinator.com/item?id=28962168 - Oct 2021 (141 comments)


And yet again, twice in a row this time.

Note how the referenced VirusTotal result has 40+ detections [1]. I'm still wondering why info like this isn't used by PyPI and NPM. Chocolatey has VirusTotal integration for all releases.

And it's not like VirusTotal is the only option: there's CAPE [2] for dynamic analysis, MetaDefender, and Intezer Analyze, to name a few.

Really troubling that such a vital supply-chain component can be abused this easily.

One of the highlights is when someone recently used NPM to spread ransomware via a fake Roblox API package.[3]

[1] https://www.virustotal.com/gui/file/26451f7f6fe297adf6738295...

[2] https://github.com/kevoreilly/CAPEv2

[3] https://www.reddit.com/r/programming/comments/qgz0em/fake_np...


> Note how the referenced VirusTotal result has 40+ detections. I'm still wondering why info like this isn't used by PyPI and NPM.

I was contracted to help build a malware analysis pipeline for PyPI[1][2]. We don't currently have a VirusTotal detector/analyzer (IIRC, we couldn't get a high-enough-volume API token on short notice), but I think any work towards that would be greatly appreciated by both the PyPA members and the Python packaging community!

[1]: https://pyfound.blogspot.com/2018/12/upcoming-pypi-improveme...

[2]: https://github.com/pypa/warehouse/tree/main/warehouse/malwar...


It’s not clear that this would be useful; at least for the coa package, the DLL was downloaded dynamically via a script, so NPM would not have been able to detect it unless the script itself was flagged. Not sure what Chocolatey does, but it’s also hard to set a threshold on VirusTotal when random vendors produce a lot of false positives.


We're building a solution to exactly this problem at Phylum. I'm not trying to be a sales shill, but if anyone is interested in discussing ideas on how to best defend open-source libraries from these types of attacks, please get in touch, I'd love to hear from you!

We're consuming everything we can about a package to figure this out. We've built a static analysis system to reason about code (it's not perfect, but we're getting better and better). We process all the data we can get, then build analytics, heuristics and ML models to extract evidence. The evidence is then pieced together to identify software supply chain risk.

In this case there is a lot of signal to show both bad and suspicious things are happening.

1. Obfuscation: obfuscated code produces a comparatively deep AST, which isn't difficult to identify.

2. Command execution: curl, wget, and LOLBINs like certutil are pretty easy to identify. This isn't a slam dunk every time you see it, but it adds evidence toward a potentially-malicious verdict.

3. URLs: These are uncommon in libraries and add evidence.

4. Pre/Post install scripts: These are fairly commonly used for other things as well, but invoking node on a source file that is likely obfuscated is a good sign something suspicious is happening.

We're trying to build everything fast enough to make the target far less attractive for attackers before it gets a lot worse.


Given that these attacks are becoming increasingly common, package registries could at least install each package (prior to publishing) in some isolated container or VM and then run some similar malware detection on the resulting file system.
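
You can approximate this DIY today. A rough sketch with Docker (assuming the official node:16 image; "some-package" is just a placeholder):

  # install in a throwaway container so install scripts
  # can only touch the container's filesystem
  docker run --rm node:16 sh -c '
    mkdir /sandbox && cd /sandbox &&
    npm install some-package &&
    ls -laR node_modules | head -50'

Anything the install scripts write stays inside the container and is discarded on exit, and you can run whatever scanner you like over the result before then.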

Honestly, I'm strongly considering moving away from the NPM ecosystem because it's clearly become a target for malware.


But attackers are not dumb. They would circumvent whatever loose checks the package manager may have. Just considering your suggestion, the obvious immediate exploit is to not deploy the attack payload right away. Any check you come up with will eventually be defeated.


I agree that it is an unending arms race, but if NPM doesn't even plug obvious holes (like running install scripts by default), then they've lost my trust.

Edit: if anyone knows of a way to disable NPM from running install scripts automatically (without having to remember to specify --ignore-scripts on each invocation), while still allowing me to use "npm run" to manually run scripts (e.g. test scripts for my own packages), I'd love to hear about it.


npm ci --ignore-scripts


Ah, sorry, I meant a way to automatically do that (both so I don't have to type as much and so that I can't accidentally forget to add that argument). Edited my comment to reflect this.


You can run this

  npm config set ignore-scripts true
which will update ~/.npmrc (you can also create project-specific .npmrc files if you prefer)
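
If you'd rather scope it to one project, the same setting works in a repo-local .npmrc:

  echo "ignore-scripts=true" >> .npmrc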

You can see how NPM has been configured by running

  npm config list


That breaks “npm run” (violating the second part of my request).


The manpage says that directly invoking a script from package.json with `npm run X` aka `npm run-script X` will still work with this preference set to true, but that it will not run pre/post scripts. Perhaps that is good enough for your use-case?

  npm help run-script
  
   [...]
   ignore-scripts
       o Default: false
       o Type: Boolean
       If true, npm does not run scripts specified in package.json files.

       Note  that  commands  explicitly  intended to run a particular script, such as npm
       start, npm stop, npm restart, npm test, and npm run-script will  still  run  their
       intended  script  if  ignore-scripts  is  set,  but  they will not run any pre- or
       post-scripts.


Edit: That doesn't work as described on NPM 6.14.14 (the version I had), but it does work on NPM 8.1.0 (the version I just upgraded to). Thanks!


Ah, looks like it was introduced in v7.0 - https://github.com/npm/rfcs/pull/185


> Any check you come up with will eventually be defeated.

Trust. Why is it that random people can submit packages? Make it so they can't. Only trustworthy people should be able to do that. People who care, so that we don't have to. People we can trust. This is how Linux distributions work and you just don't see malware randomly making its way into official repositories.


You can DIY. There are also plenty of reputation-based dependency scanners out there (especially in the corporate world) that will look at license, commit rate, number of committers, release frequency, transitive dependencies, etc. and generate a "safety" score for you.

E.g. "This is maintained by a huge network of contributors who contribute to other huge projects" vs "This is a single developer with a couple commits a year"


I wish it worked that way. I just peeked into how Python packages in Debian-based distros work. They are most frequently PyPI packages with some Debian wrapping, so we're back at the same problem.


PyPI allowed me to make an account and just push packages there like it was nothing. Great for me, not so great for users.

These Debian wrappers, however minimal, imply the existence of a maintainer trusted by the Debian community. It's assumed that this maintainer has read the source code and determined it is safe.


Or at least pinned a version that is known good


“Don’t let perfect be the enemy of good”


I've seen some talks about implementing this at the programming level but can't remember the specifics. Basically treating dependencies similar to apps on a smartphone where they each run in a namespace or security context and there's control over what data gets passed in and out of the module or package. (In stark contrast to the current model where everything just runs in a global namespace)


We are working on this here [1].

Uses the object capability model provided by SES [2].

[1] https://github.com/LavaMoat/LavaMoat [2] https://github.com/endojs/endo
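
To give a flavor of the capability model: with SES, code is evaluated inside a Compartment and can only touch what you explicitly endow it with. A rough sketch based on my reading of the SES README (details may differ by version):

  // npm install ses
  require('ses');
  lockdown(); // freezes shared intrinsics (Object, Array, ...)

  // the compartment's global scope contains only what we endow
  const compartment = new Compartment({ print: console.log });
  compartment.evaluate("print('hello from inside the compartment')");
  // no require, no process, no fs in there unless explicitly passed in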


Why would malware authors target NPM in particular? Maybe they are targeting lots of package registries, and NPM is just more vigilant?


Chocolatey requires scans on _most_ uploads to the service. Under a certain threshold it only gives you a warning that says "this package had x detections".

Here's Far Manager, which for some reason has some hits on VirusTotal:

https://community.chocolatey.org/packages/Far


I think Chocolatey has manual screens when there are more than 5 detections, but not entirely sure


I think that volunteers (some of them perhaps paid) should check the validity of code, at least for projects with over 10-100k downloads. In the case of crates.io (Rust), there is cargo-crev [1]. Also, npm should popularize 2FA.

[1] https://web.crev.dev/rust-reviews/


Say, like package maintainers do for major Linux distributions?


If you're interested in preventing this sort of thing, I'd appreciate comments on an RFC I just submitted to npm to make install scripts opt-in instead of default behavior. While of course not perfect, this simple change would go a long way toward increasing the difficulty of these sorts of attacks: right now, as long as a machine merely installs the package in question, without running any code from it, the malicious program has a chance to run.

RFC: https://github.com/npm/rfcs/pull/488

Related HN post: https://news.ycombinator.com/item?id=29122473
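
To make the threat concrete, here's a minimal, hypothetical example of how little a package needs in order to run code at install time (preinstall is a real npm lifecycle hook; "innocuous-looking-package" and evil.js are made-up stand-ins):

  {
    "name": "innocuous-looking-package",
    "version": "1.0.0",
    "scripts": {
      "preinstall": "node evil.js"
    }
  }

With today's defaults, merely running `npm install` on a project that (transitively) depends on this executes evil.js before you ever import anything.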


Your comments on the RFC cover this, but I want to highlight two things:

1) The vast majority of packages on npm don't require install scripts to work.

2) Many of them that currently include install scripts are just ads asking people to contribute code or money to their project.


Isn't that only useful for the very rare minority of people that install packages then don't run the code right after? :)


The fact that they don't need to be run is what makes them so hard to spot and easy to sneak in. This allows for some very-hard-to-mitigate strategies. It's easier to sneak packages into metadata files like package.json than to create "excuses" to import files that aren't really needed. It can be abused in more complicated ways as well, for example:

1. You add a totally "safe" dependency that you control, let's call it "shell-dependency", as an innocuous part of a PR to a "popular-package". Again, even if you inspect this package, it's totally fine. The current version of shell-dependency is 1.0.0, but it of course goes into the package.json as "1.x.x"

2. You now add malicious dependency to shell-dependency, and bump shell-dependency to 1.0.1, meaning every consumer of "popular-package" now gets your "malicious-package".

Notice that this was accomplished with zero traceable GitHub history. Unless every package up the line uses a package-lock.json (which is explicitly recommended against unless you are an end-user application), "malicious-package" is able to enter the dependency chain undetected. If it required some sort of code import, then it would have more opportunities to be spotted. There are of course ways to do this with attacks that require running code as well, but this makes it super easy, especially considering that people often install packages as root, even when they run their apps not as root.
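
To sketch step 2 with the hypothetical names above: the new shell-dependency@1.0.1 needs nothing but one extra entry in its package.json:

  {
    "name": "shell-dependency",
    "version": "1.0.1",
    "dependencies": {
      "malicious-package": "1.0.0"
    }
  }

Since popular-package declared "shell-dependency": "1.x.x", every fresh, unlocked install now resolves 1.0.1 and silently pulls in malicious-package.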


I understand the attack but my point is that the next step after installing the popular package is probably to import/require it in my code, then run my unit tests, which will allow the evil package to run whatever code they intended to run.


Interestingly enough, this actually isn't always the case! For example, install scripts provide the unique ability to affect build machines, which often don't run the code, but do run install scripts. Build machines can still have sensitive keys though, and will often automatically start building in response to a PR being submitted. So in these cases, the attack targets a part of the system that is inaccessible to attacks that require importing or actually running code, and only requires submitting a pull request, not even having it accepted!

Another example is when npm is used for front-end code (which is a lot!). It's true here that the code will eventually be imported or required, but in the browser sandbox. Install scripts in this instance meaningfully change the attack surface from just whatever cookies you have set up to the entire contents of your filesystem.


Yes I answered a sibling comment on front end code, I'll ask here again: doesn't browser code have tests that are run with nodejs?

And my build / CI machines generally run the tests too (though it's possible to separate the test runners from the build runners, I would not bet it's a very common thing).


1. With regard to build / CI, I certainly don't know the percentages worldwide, but this example actually comes from my own experience: a company I used to work at had, I believe, this setup of separate build and CI machines (for a variety of reasons, including the resource requirements being quite different).

2. At least with our frontend tests, no, we don't run them "in" node-js. Well, to clarify, the "tests" are run in node-js, but the frontend code in question is run in Puppeteer. Hence those libraries do not usually have any chance of touching our filesystem (except for their install scripts). This is a less complicated answer for non-"isomorphic" companies -- for example if your backend is Ruby and you use puppeteer-ruby, then I think we can agree that your npm packages should never run on your machine outside of a browser.

Just to give you a third case, again from my own experience, install scripts are also a good vector for typo attacks. If you type `npm install lodsah` instead of `npm install lodash`, and you don't notice and hit control-C fast enough, that mere installation can be sufficient to compromise your system.


Just an extra thought: as soon as you run anything via an npm "script", the battle is also lost, because any package is free to drop anything into node_modules/.bin, which then gets added to the PATH of those npm "scripts". So I guess you will want to fix that next :)


Many packages on npm are not run in a local node environment, they run in a browser which is sandboxed.

Regardless, just because there are additional vectors to exploit the user doesn't mean closing off one vector isn't worth doing.


Yeah I also thought of that but aren't front end devs running some kind of unit test in node? Do they always run everything in their browser? (Honest question, I'm a BE person)


I initially had the same thought, but I came to a different conclusion because the presence of an install script is not a given for most packages.

For example, if I installed a command line arguments parser and it claimed to require running a setup script, I would immediately be very suspicious.

For a huge package like TypeScript, I'd probably just immediately let the script run and trust Microsoft to not publish malware (and NPM to not change the package contents).


Non-tech solution is to pin versions and read (hacker)news before updating.


Problem is, as someone mentioned in the GitHub thread as well, people just press OK, especially if it's a transitive dependency.


I've answered this on GitHub, but will provide an answer here too. Packages aren't like end-user software, for two important reasons:

1. The people involved are developers using development components, who are much better equipped to understand an installation failure and take proper action than an end user who is just trying to click through to play a video game or something. But more importantly:

2. The vast majority of the installs happen on automated machines (like CI), where you definitely want to fail when something drastically different happens like a new package is all of a sudden running a script. The tests would fail, you would look at the reason, it would be because some weird script is about to run on the machine, and you'd adjust your PR accordingly. This allows multiple levels of consideration: 1) the original author deciding to do something about the failed install (even if it's just appending the flag and not thinking about it), and 2) the PR reviewer having code-as-data evidence that this code change would mean new foreign code not represented in the commits will be running on their machines. This is huge.

The other important point about this is that packages precisely have different risk models depending on whether you are installing things locally or on production machines, where a malicious package could be catastrophic. That's why the RFC allows you to have individual user configuration where you explicitly allow certain packages "from now on" (which is I think the way most people want to think about this: the first time you install something and it warns you and you look into it, but from then on you say "this one is good"). On the other hand, the actual scripts and repository have no such configuration and thus require the installation to include the specific flags, again, clearly documenting the expected results of a seemingly innocent process that can actually currently have bad consequences.


I don't think anyone is claiming that making install scripts opt-in will fix all the problems. But adding any friction discourages the behavior. And if the norm changes, it's reasonable to believe package maintainers will start opting for dependencies that don't require install scripts, further encouraging people not to include them with their packages.


One thing I wish were much easier to do with NPM is, when running `npm update`, to only pick up the most recent compatible versions that are at least X days old.

That is, for sensitive apps, I don't want to use versions that are less than, say, a month or so old unless I specifically override it. I want to stay up-to-date but not too bleeding edge, specifically to avoid situations like this.
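
npm does have a config that gets close, if I remember correctly: `--before` tells the resolver to only consider versions that were already published by a given date (the GNU date invocation below is just one way to compute it):

  # only resolve versions that already existed 30 days ago
  npm install --before="$(date -d '30 days ago' +%Y-%m-%d)"

It's per-invocation rather than a standing project policy, though, so it's easy to forget (it can also be set as before=... in .npmrc).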


That would be a big improvement, but I assume that people would override this check whenever they were updating a package to fix a minor accidental vulnerability that had been found in it (detected by an "npm audit").

An attacker who had control over an account would wait until just such a moment to add their own version which includes a much worse payload, and people would rush to download it, thinking they were just installing a fix for the minor vulnerability.


Renovate bot can do this: set it to wait X days before accepting a merge, and pin your versions so there's no chance of updating by mistake.
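
A sketch of what that renovate.json might look like, assuming these option names are still current (stabilityDays delays merging until a release has aged; rangeStrategy "pin" pins exact versions):

  {
    "extends": ["config:base"],
    "rangeStrategy": "pin",
    "stabilityDays": 14
  }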


This is why JS runtimes should add the ability to set permissions on a per-module basis. Deno is a step in the right direction by requiring permissions for a script to be specified (e.g. deno run --allow-read --allow-net myscript.ts), but the permissions are global for the entire script and can't (yet?) be configured differently for each module / dependency.


Alternatively, JS programmers should exhibit less contempt for the standardized, sandboxed runtime that JS was originally created to target: the Web browser.

What's nuts is that any of these projects (whether they be single components, larger utilities, or full-blown apps) requires a build step that involves anything more complicated[1] than a single machine-readable document in the web browser's native file format and that sits alongside (or in place of[2]) the project README. With so many programmers writing code for the express purpose of making digital documents with behavior dynamic enough to trick you into thinking that the page you have open is really an app, no one in the community with any clout ever stops and says, "Gee, since we're at it, maybe we ought to take this tech that enables us to securely run code on demand and focus it on the goal of allowing other programmers to configure all these modules that we're sharing with one another, or to handle the finishing step of a collection of modules that make up a given app."

Then again, that would presume that any of the stuff that this industry engages in is actually meant to solve any problem, rather than creating a neverending supply of them in order to justify the paychecks being written and the egos they're feeding. If stuff's not laughably overengineered to the point of constantly breaking for no good reason[3], does it even count as "real"[4] programming?

1. https://www.colbyrussell.com/2019/03/06/how-to-displace-java...

2. https://news.ycombinator.com/item?id=28407936

3. https://news.ycombinator.com/item?id=24494434

4. https://news.ycombinator.com/item?id=17165784


JS programmers don't have contempt for browsers, it's just that server runtimes like Node and Deno serve a different purpose. A browser isn't the right tool for running a server application or a command line program. If you want a server runtime that behaves more like a browser, though, check out Deno.


The runtime environment of the application itself is a separate matter from the environment needed during the fetch-and-configure phase for the modules that the application depends on.

This malware was lurking in an auto-executing script that runs upon merely trying to fetch the module. Under the regime that I outlined, malicious actors would be forced to try doing their dirty work within the module's business logic itself, which is far easier to mitigate—for example, with a policy where no code gets merged into the application without undergoing review, whether it is written by a third-party or someone on your team. In spite of all the craziness that the 2010s led to with the rise of specialized package registries trying to recreate CPAN (often erroneously called "language package managers") and giving the illusion that there is such a thing as a free lunch, we're going to have to eventually deal with reality and accept that this is the only reasonable way to approach software development for the stuff that we need to rely on (and that no amount of trying to sweep the problem under the rug with references to Trusting Trust will make the counterargument a sound one).


Genuinely curious: why is malware always discovered in npm packages, and not pip (python), gradle/maven (JVM), cabal (Haskell), cargo (Rust), CRAN (R), etc.? Or are there major vulnerable packages in those repos but they just don't get audited?


It's an attractive target. Other ecosystems (maybe besides Rust) rely on large packages with minimal dependencies, and those packages are often first-party (Entity Framework, for example).

NPM meanwhile is a never-ending web of tiny one-liner packages, required by other one-liner packages, required by two-liner packages, required by single-function packages, required by... required by React. And thus, adding malware to `is-number` adds it to all 8766235452 packages that depend on it.


I thought something like this happened in pip the other year.

Part of the NPM issue is that everything gets atomized into tiny libraries and no one seems to care about the dependency explosion.


The older such a system is, the more there is a "don't trust anyone" mentality. A few decades ago you did not just run alien code on your system.

Edit: without looking at it beforehand


There definitely is malware around on PyPI, and typosquatting.



Thanks.

> Since then, the npm security team has removed all the compromised coa and rc versions to prevent developers from accidentally infecting themselves.

Removing all traces of the evidence is not something "security teams" should do. Instead of sweeping security incidents under the rug (leaving only the twitterverse as a record), they should at least mention on the package page that these versions existed and contained malware.


They posted the version diffs (https://my.diffend.io/npm/coa/2.0.2/2.0.4) in the article. Not sure if this site catches all the changes though.


Every time I see news like this, I am amazed by the absolute lack of permission management in nodejs and npm.

I mean, a package.json with changing permissions and an alert or manual confirmation step could easily have prevented this.

NPM is pretty much the definition of a security nightmare, because you cannot guarantee anything.

Any dependency down the tree can compromise anything upstream.

I think that package managers must offer build bots that use the source code (git repositories) as the source of truth rather than their own packages. That's the only way that comes to mind to guarantee that the publisher of the package is actually the owner of the repository.

If a git repo changes, warn all users. If a permission changes, warn all users. If a header/symbol file changes, warn all users.


Using npm is like russian roulette. Someday it makes your head hurt really bad!


I checked the readme of both those packages and I can't for the life of me understand why would anyone use either of them.

Why the fuck do all these leftpad is-even hello-world tic-tac-toe packages have millions of downloads?


Command line argument parsing and config loading both seem like very sensible library abstractions to me. This isn’t leftpad.


Command line argument parsing and config loading both seem like something that the standard library should provide.


Ok, now, what languages besides Python and Go provide command-line argument parsing? And Go doesn't do it in a `professional` way. You either write your own, which can easily turn into a clusterfuck, or use a third-party library. Even in Go, people use cobra [1]. Also, embedding a lot of functionality in a standard library isn't great either, because if some vulnerability is found it's really hard to patch: you need to push new versions, and (for example on Linux) some distro maintainers won't push them for `stability` reasons etc. A standard library should provide basic functionality (in most general areas), but not very advanced functionality.

[1] https://github.com/spf13/cobra


> what languages besides Python and Go provide command-line argument parsing?

Even POSIX gives you getopt(1) and getopt(3). What other language doesn't? I can only think of Java.


POSIX is not part of the ISO C or C++ standard. On Windows, what are you gonna do?

Also, other languages are Rust, Kotlin, Swift (to name a few `modern` ones). Yes, Kotlin and Swift have `first class` CLI parsing libraries, but they are not part of standard library.


"First-party" is distinct from "first class". The difference between a first-party library and the standard library ranges from "slightly weaker compatibility guarantees" to "it's supported in all environments where it makes sense, but the language can run unhosted so that's not everywhere" to "no difference at all, we just didn't want to package it with the compiler".

You're also missing the forest for the "well actually" trees: Lots of languages have argument parsing in their stdlib.


Sure, and Node gives you the process.argv array. The point is having higher level APIs than that.


it feels like the more higher level APIs we add, the shittier and more annoying software becomes


getopt is a higher-level API. process.argv is equivalent to, uh, argv.


It's not part of the standard library, but Swift has the first-party ArgumentParser[0]. Other languages could use a similar model (though what "first party" means for JavaScript is unclear).

[0]: https://github.com/apple/swift-argument-parser


as if there even was one command-line or config standard. especially across different operating systems.

it absolutely does not belong in stdlibs, where it can never be changed. that's how you end up with so many terrible CLI tools using Go's `flag` package.


I don't understand, what's wrong with Go's flag, compared to, say, python's argparse?


php


>Command line argument parsing

20-30 LoC maybe. `process.argv.slice(2).forEach(str => ... )`.

there is no access to the raw command-line invocation, sadly, so you can't really do anything fancier than that.
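
for the curious, a sketch of what that 20-30 LoC looks like (boolean flags, --key=value options, and positionals; no validation, aliases, or help text, which is exactly the part libraries add):

  // minimal argv parser: boolean flags, --key=value options, positionals
  function parseArgs(argv) {
    const flags = {};
    const positionals = [];
    for (const arg of argv) {
      if (arg.startsWith('--')) {
        const eq = arg.indexOf('=');
        if (eq === -1) flags[arg.slice(2)] = true;
        else flags[arg.slice(2, eq)] = arg.slice(eq + 1);
      } else {
        positionals.push(arg);
      }
    }
    return { flags, positionals };
  }

  // node app.js --verbose --out=dist build
  // -> { verbose: true, out: 'dist' } [ 'build' ]
  const { flags, positionals } = parseArgs(process.argv.slice(2));
  console.log(flags, positionals);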

>config loading

that thing "RC" package does - looking up the config file in random locations - is really strange to me. aren't you the one in control of where it is stored?


Just looking at the readme for Coa it’s very obviously more than the code you outlined. You’re arguing against a strawman here.


honestly, yeah. I just saw "command line parser" and dismissed the rest as useless bloat


Chained dependencies? If you can fool one popular package to depend on you, you ride their coattails.

And perhaps some faked download numbers to lend an air of authenticity.


Maybe we need to hold the popular packages accountable for stuff like this.


Before we go mobbing innocent open source devs on Twitter, it'd be great to know how far NPM has progressed on 2FA. Up until 2018 NPM didn't have 2FA at all. They just introduced it. It'd be nice if they could give some kind of progress report on how widely adopted it's become. Ideally it should be required for publishing packages. Or at the very least, it'd be great to have some transparency about which package authors are actually using it, who aren't, and who's delegated their authorization to some other vendor like Travis -- so we as users can make our own informed choices about risk. It'd also be useful to have charts that log dependency gravity over time since an important question in situations like this is: did RC and Coa go from 70k to 17m users yesterday? Or have they been established for a long time?


This is like the third one this week right?

I know people keep saying post-install scripts should be opt-in, but then malware will just wait for the first run instead.

How about an option to refuse to install any packages that have been published in the past week/2 weeks? That way hopefully malware like this would have been spotted before you end up running it locally.


If everybody waits two weeks, then nobody will notice it on the first two weeks.


The registry could send an email to the account which uploaded the package saying "Thank you for uploading version 1.2.3 of your-package." which would give them two weeks to think "Wait, I didn't upload a new version."

Of course, if the attacker has access to their npm account, they can probably change the email address associated with the account too, so change-of-email requests should send a "Thank you for changing your email address" to the original address.

The developer might have difficulty regaining control of the account, but hopefully they could inform the npm security team who could quite quickly confirm that a malicious package had been uploaded, which would be enough to get the malicious package taken down and the account locked.


They could also attack metadata parsers next - I don't think those are very hardened right now.


Is the advisory genuine?

It links to the github repo, where the latest commit is from 2018 for version 1.2.8.

It links to npmjs page, that shows 48 versions, where the latest version is 1.2.8 from "3 years ago".

Yet it has 1.2.9/1.3.9/2.3.9 for "Affected versions".

Did npmjs "revert" these versions and any clue of their existence? The npmjs page links to dominictarr's repository. The npmjs site doesn't seem to have a "who owns this package name" besides the repository/homepage links. Very confusing.

I remember some years ago there was some story involving the original author's handing maintainership rights to some shady dude. Is it about that time, or is it about something more current?


Practically all package managers (NuGet, crates.io, NPM, etc...) are decoupled from the source code. What you download is NOT necessarily what's in GitHub.

I pointed this flaw out repeatedly in the Rust forums when there were discussions about improvements that could be made to crates.io. They made it very clear that "everyone understands" that the crates themselves are the "source of truth", and that nobody should be doing security reviews by going to GitHub or wherever.

So what happens in reality?

Precisely what you just did. People instinctively click the source repo link, and browse around in the GitHub history view to "see what happened".

Sigh.

It's like trying to explain to someone that the cargo lift design of the Death Star is dangerous without handrails. Then someone points out that when you signed up to be a Stormtrooper on the Death Star there was a clause in the contract (page 537 paragraph 7) that clearly states that it is your responsibility to avoid fatal falls due to precipitous ledges.


I think I remember discussing this briefly in #rust with you. It's clearly not the case that "everyone understands" how these package managers actually work, but I'd rather see the reality become more obvious than give up and shackle these package ecosystems to Microsoft even harder than they already are.


NPM, Cargo, and the like are basically like the people saving a link to a random Wikipedia article, and then making the shocked Pikachu face when their presentation in front of the boss shows a defaced article with the Goatse picture in the middle of it.

There's a solution to this problem, of course. The Wikipedia team provides the tools you need! You can link to a specific revision of an article so there are no surprises. What you saw when you reviewed the content is what you get when you project it in the board room, or send out that mass email that includes your boss.

Similarly the solution for crates.io could be as simple as having hyperlinks go only to specific commit hashes. And then require that the crate content match the hash.

These days I hear a lot of developers complain that they "Just want to...". I always complete the sentence with "... ignore my responsibilities."

Package managers are in the same camp. "I just want to distribute packages.". Okay, sure, but your responsibility is to do it so that downstream consumers fall into the pit of success and aren't burned by supply-chain attacks.

You can argue, or you can start working on catching up to the encyclopedia people that came from a background in porn hosting and start taking security seriously.


> require that the crate content match the [commit] hash.

If you want to audit a crate, you don't need to require that it "matches" anything else, you can just audit the crate. Download the source tarball from the same URL that Cargo would and audit it. I think the problem with crates.io is that it just gives you the GitHub link (possibly misleading - bad!) and doesn't just give you a button to download or browse the tarball that Cargo actually uses (what you see is what you get).

Defining "matches" and enforcing it against a remote Git repository is non-trivial. It gets worse with NPM because those packages are sometimes the output of the Typescript compiler or Webpack, so now you need reproducible builds (a huge task) and a CI infrastructure to validate them. Nuget distributes .dll files, which often aren't even open source. There's no hope to enforce a correspondence with a Git repository there. A developer who wants to audit a Nuget package has no choice but to decompile it.

NPM clearly has a malware problem, and Cargo will eventually have one because it really wants to be like NPM. I'm not convinced that what you propose is the solution.


I get it. My mistake. Page 537 paragraph 7. I must have just missed it on the first reading of the contract.

I won't make that particular mistake again.

But just like the thousands (and thousands!) of people that are befuddled as to why Rust's console output is slower than Python's, it's a pit of failure that others will fall into.

Over and over. And over.


I saw, in one of the repos, the maintainer confused about what happened: seemingly someone somehow impersonated him and released new versions to npm without actually touching the repo itself!


Would it be better for package managers to default to staying at a fixed version? I know npm defaults to semver upgrades. You say

    npm install foo@3.1.7
And it, by default, inserts "foo@^3.1.7", which means "anything 3.1.7 or higher, but not 4.x.x".

In other words, the next time someone installs the dependencies it could be 3.1.8, 3.9.7, 3.1234.999 etc...

But maybe it should default to just the actual version, and all upgrades should be required to be manual. Checking my HD I see I have lots of references to "rc@^1.1.6", "rc@^1.2.8" etc., all of which would install 1.2.9 if I reinstall the deps.
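
You can already make that the default for anything you install going forward, via save-exact:

  npm config set save-exact true
  npm install foo@3.1.7   # now saved as "foo": "3.1.7", no caret

That only pins your direct dependencies, though; transitive ones still float unless you also commit a lockfile.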


I've created Vouch in an attempt to address this problem:

https://github.com/vouch-dev/vouch

Vouch lets users create and share reviews for NPM packages. Project dependencies can then be checked against those reviews.

Vouch uses extensions to interface with package ecosystems. It's simple to create a new extension. Extensions currently exist for NPM, PyPI, and Ansible Galaxy.

I'm currently working on a website to index known reviews and publish official reviews.

I hope you guys find it useful! Drop by the Matrix channel if you have any feedback to share: #vouch:matrix.org


That's not really the solution for this problem, though, which is very specifically when a project maintainer's account gets compromised, so then the bad guys publish a new malicious version of that library that gets picked up by anyone using non-pinned NPM versions (i.e. most everyone).

There are a couple more straightforward ways to do this:

1. Require 2FA, ideally hardware key 2FA, for anyone publishing a package with any sizable following.

2. Make running of preinstall/install scripts opt-in.

3. Make the semantic versioning syntax optionally more restrictive. If I specify I want version ^2.2.1, I'd like to be able to specify that I DON'T want to pull 2.2.2 the moment it becomes available, but perhaps want some amount of latency before pulling that.


Reviews in Vouch refer to a particular version of a software package. If a new release is issued by a malicious actor, the new release would require a new review.

But the review process does not need to re-start from scratch. Reviews from other versions can be used to lessen the workload.

On the subject of automatically updating packages: the Vouch dependency analysis can be included in CI. Un-reviewed or review-failing dependency updates can be flagged for attention.


Who's reviewing the software package's dependencies?


Each dependency of a software package would have its own separate set of reviews.

Anyone can produce a review using Vouch. Official reviews will also be published in the future.


So every version of every dependency of every package needs a review? It only takes one version of one dependency of any software package to be compromised for a supply-chain attack to succeed.


Each developer may choose to minimize the software dependency attack surface to a different degree.

Perhaps they would trust a package published by Google without a review. But would require a review before using a package from an indie developer.

Incremental decreases in the attack surface are valuable.


> 2. Make running of preinstall/install scripts opt-in.

They already are:

  npm install --ignore-scripts


What you've written there is literally the definition of opt out.


I'd like this to work, but it seems like it relies on packages being statically "good" or "bad" - what happens if a package is legitimately well trusted but then the main dev gets backdoored and a bit of extra code is injected?


That extra code will stand out as not having been reviewed.


Oh, you mean to have reviewers look at every line of every version (at least cumulatively). Yes, that would work, and while the effort is significant I appreciate that it's the effort you'd want regardless so this helps share the load.


That's right! The purpose of Vouch is to continually decrease the cost of the review process with each new review.


Sounds really interesting. I'm kinda scared to use npm right now, to be honest. What dangers may be hidden deep in the dependency trees? But is a review then done for every update, since rc used to be a legit package?


A review corresponds to a particular version number. But the review process does not need to start from scratch with each version number increment. Reviews from previous versions can be leveraged to lessen the workload.


> The check command generates an evaluation report of local project dependencies based on available reviews:

Is there an example of a generated report?


Actually had this package installed somewhere...

  "version": "1.2.8",
Phew, really lucky. Going to nuke npm now.


Seems like a good choice to work at a cybersecurity company these days. Job security is guaranteed.


Yea, but you'll end up cursing everything about computing because everyone and their dog starts their day with another `curl | sudo bash` pipe that installs something all across the system pulling in a metric shitton of random packages and you can't even imagine why all this nonsense is needed for a hello world app.


How long were the compromised packages available from npm?



