Cool! I've been playing with the same code -> graph concept for LLM work. Why did you decide to go for a pseudo-compiler with a ton of custom rules rather than try to interact with the AST itself?




Hi! It comes down to the limitations of tree-sitter: it's insanely fast and easy to use, but it stops at syntax and nodes. The TypeScript compiler provides semantic information, with full type checking and cross-module resolution. It's a small nightmare, since I have to write every extraction and parser for it (which is why I call it a "pseudo-compiler"). It's a necessity for getting full call-chain provenance across callee/caller, framework and validations, which is a "hard" requirement for the taint analysis to work. If you want to dig into the code: the top layer is ast_parser.py, which routes to a few places. Taking JS/TS as an example, look at data_flow.ts / javascript.py, which show the AST/extraction/analysis layers that capture the data and make sense of it in the database. :)
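
To make the syntax-vs-semantics gap concrete, here is a toy sketch. It is not the project's code: it uses Python's stdlib ast module as a stand-in for a syntax-only parser, and the module names (db, handlers) and helper functions are hypothetical. A syntax-only pass sees the call name as written at the call site; attributing that call to a definition in another module needs import and symbol information on top.

```python
import ast

# Two hypothetical modules, kept as strings so the sketch is self-contained.
MODULES = {
    "db": "def query(sql):\n    return sql\n",
    "handlers": (
        "from db import query as q\n"
        "def handle(user_input):\n"
        "    return q(user_input)\n"
    ),
}


def syntactic_calls(source):
    """Syntax-only pass: call names exactly as written at the call site."""
    return [
        node.func.id
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    ]


def import_table(source):
    """Map local name -> (module, original name) for `from X import Y as Z`."""
    table = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom):
            for alias in node.names:
                table[alias.asname or alias.name] = (node.module, alias.name)
    return table


def defined_functions(source):
    """Names of all functions defined in the given source."""
    return {
        node.name
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.FunctionDef)
    }


def resolved_calls(module_name):
    """Resolve each call site to (module, function) using cross-module info."""
    source = MODULES[module_name]
    imports = import_table(source)
    local_defs = defined_functions(source)
    resolved = []
    for name in syntactic_calls(source):
        target = imports.get(name)
        if target and target[1] in defined_functions(MODULES.get(target[0], "")):
            resolved.append(target)  # imported from another known module
        elif name in local_defs:
            resolved.append((module_name, name))  # defined in this module
        else:
            resolved.append((None, name))  # unresolved
    return resolved


print(syntactic_calls(MODULES["handlers"]))  # the alias as written
print(resolved_calls("handlers"))            # the definition it points to
```

The syntax-only pass reports the alias q, while the resolving pass attributes the call to query in db. A real semantic layer also has to deal with types, re-exports, methods and framework wiring, which this sketch ignores.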


