This attack is particularly interesting. The attackers targeted the massively popular 'requests' package on PyPI, used bitsquatting to generate typosquat candidates, and the end result was ransomware being deployed.
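For context, bitsquat candidates are names that differ from the target by a single flipped bit, so a machine with a memory error can resolve the wrong package name. A minimal sketch of generating them (the helper name and the validity alphabet are my assumptions; PyPI normalizes names to roughly this character set):

```python
import string

def bitsquat_candidates(name: str) -> set[str]:
    """Generate single-bit-flip variants of a package name,
    keeping only variants made of characters that are valid
    in a normalized PyPI package name."""
    valid = set(string.ascii_lowercase + string.digits + "-._")
    candidates = set()
    for i, ch in enumerate(name):
        for bit in range(8):
            flipped = chr(ord(ch) ^ (1 << bit))
            if flipped != ch and flipped in valid:
                candidates.add(name[:i] + flipped + name[i + 1:])
    return candidates

# e.g. flipping the low bit of 'r' (0x72) gives 's' (0x73),
# so "sequests" is one of the candidates
print(sorted(bitsquat_candidates("requests")))
```

Registering even a handful of these is cheap for an attacker, which is part of what makes the technique attractive.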
Totally agree. It feels like there is a pretty strong inverse correlation between standard library size and the average depth of a dependency tree for projects in a given language. In our world, that is pretty close to attack surface.
Rust is another example of this. Just bringing in grpc and protobuf pulls in about a hundred dependencies, some of them seemingly unrelated. For a language aimed at avoiding security bugs, I find this to be an issue. A good dependency manager and a small (or optionally absent) stdlib have led to highly granular dependencies and to bringing in giant libs for tiny bits of functionality.
We've found a lot of open-source packages that are authored by (well, released by authors identified by) disposable email addresses. We were shocked to find companies doing this, too.
The reason is obvious: people crawl pypi.org/github.com/npmjs.com and email their job posts or product launches. Every platform that requires an email address and shows it publicly will necessarily get a lot of disposable ones.
(Disclaimer: I work at Phylum, which has a very similar capability)
Not all of it has to be manual. Some vulnerabilities come with enough information to deduce reachability with a high degree of confidence using some slightly clever automation.
Not all vulns come with this information, but as time goes on the percentage that do is increasing. I'm very optimistic that automation plus a bit of human curation can drastically improve the S/N for open-source library vulns.
A nice property of this is: you only have to solve it once per vuln. If you look at the total set of vulns (and temporarily ignore super old C stuff) it's not insurmountable at all.
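To make the automation idea concrete: if an advisory names the vulnerable function (as OSV-style advisories increasingly do), a crude first-pass reachability check is just "does this file import that symbol and call it?" A sketch using Python's `ast` module (the function name and its simplifications are mine; a real system would resolve aliases across modules and build a proper call graph):

```python
import ast

def calls_vulnerable_symbol(source: str, module: str, func: str) -> bool:
    """Crude reachability check: does this source file do
    `from <module> import <func>` and then call it?"""
    tree = ast.parse(source)
    imported = False
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom) and node.module == module:
            for alias in node.names:
                if alias.name == func:
                    imported = True
                    names.add(alias.asname or alias.name)
    if not imported:
        return False
    # Look for a direct call to any local name bound to the symbol.
    return any(
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id in names
        for node in ast.walk(tree)
    )

src = "from yaml import load\nconfig = load(raw)"
print(calls_vulnerable_symbol(src, "yaml", "load"))  # prints True
```

Even this naive check separates "the vulnerable function is actually invoked" from "the package merely appears in the lockfile", which is where most of the noise comes from.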
I support this strongly.
Given that a software engineer should be able to understand and write code, it's a no-brainer to go the scraping route.
Even I, with absolutely no CS background (never studied it, but it's my digital life), started to learn Python for fun and did some scraping because I'm too lazy in some respects.
Scrape your info.
We're building a solution to exactly this problem at Phylum. I'm not trying to be a sales shill, but if anyone is interested in discussing ideas on how to best defend open-source libraries from these types of attacks, please get in touch. I'd love to hear from you!
We're consuming everything we can about a package to figure this out. We've built a static analysis system to reason about code (it's not perfect, but we're getting better and better). We process all the data we can get, then build analytics, heuristics, and ML models to extract evidence. The evidence is then pieced together to identify software supply chain risk.
In this case there is a lot of signal to show both bad and suspicious things are happening.
1. Obfuscation: obfuscated code produces a comparatively deep AST, which isn't difficult to identify.
2. Command execution: curl, wget, and LOLBINs like certutil are pretty easy to identify. This isn't a slam dunk every time you see it, but it adds evidence to a potentially malicious claim.
3. URLs: These are uncommon in libraries and add evidence.
4. Pre/Post install scripts: These are fairly commonly used for other things as well, but invoking node on a source file that is likely obfuscated is a good sign something suspicious is happening.
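Signal 1 above is simple to compute. Here's a minimal sketch of the AST-depth idea for Python packages (the threshold and function name are illustrative assumptions, not Phylum's actual implementation):

```python
import ast

def ast_depth(node: ast.AST) -> int:
    """Maximum nesting depth of a Python AST. Obfuscated code
    (packed byte lists, chained comprehensions, deeply nested
    expressions) tends to score far higher than typical
    hand-written code."""
    children = list(ast.iter_child_nodes(node))
    if not children:
        return 1
    return 1 + max(ast_depth(child) for child in children)

plain = ast.parse("x = 1 + 2")
packed = ast.parse("exec(''.join(chr(c ^ 7) for c in [103, 124, 99]))")
print(ast_depth(plain), ast_depth(packed))
```

On its own a depth score proves nothing, but combined with the other signals (command execution, embedded URLs, install hooks) it contributes useful evidence.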
We're trying to build everything fast enough to make the target far less attractive for attackers before it gets a lot worse.
I'm done trying to cast spells at Make