
I really need to dig into the more recent advances in knowledge graphs + LLMs. I've been out of the game for ~10 months now, and am just starting to dig back into things and get my training pipeline working (darn bitrot...)

I had previously trained a llama2 13b model (https://huggingface.co/Tostino/Inkbot-13B-8k-0.2) on a whole bunch of knowledge graph tasks (in addition to a number of other tasks).

Here is an example of the training data for training it how to use knowledge graphs:

easy - https://gist.github.com/Tostino/76c55bdeb1f099fb2bfab00ce144...

medium - https://gist.github.com/Tostino/0460c18024697efc2ac34fe86ecd...

I also trained it on generating KGs from conversations and articles you provide. So from the LLM side, it's way more knowledgeable about the connections in the graph than GPT-4 is by default.
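For anyone curious what consuming that kind of model output looks like downstream, here's a minimal sketch of parsing generated triples into tuples. The `(subject, relation, object)` line format here is a hypothetical example for illustration, not Inkbot's actual output schema (see the gists below for the real training format):

```python
import re

def parse_triples(text):
    """Parse lines like '(subject, relation, object)' into tuples.

    The line format is a hypothetical stand-in, not the model's
    actual knowledge graph schema.
    """
    triples = []
    for line in text.splitlines():
        m = re.match(r"\(([^,]+),\s*([^,]+),\s*([^)]+)\)", line.strip())
        if m:
            triples.append(tuple(part.strip() for part in m.groups()))
    return triples

sample = """(Inkbot, based_on, Llama-2 13B)
(Inkbot, trained_on, knowledge graph tasks)"""
print(parse_triples(sample))
```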

Here are a couple examples of the trained model actually generating a knowledge graph:

1. https://gist.github.com/Tostino/c3541f3a01d420e771f66c62014e...

2. https://gist.github.com/Tostino/44bbc6a6321df5df23ba5b400a01...

I haven't done any work on integrating those into larger structures, combining the graphs generated from different documents, or using a graph database to augment my use case...all things I am eager to try out, and I am glad there is a bunch more to read on the topic available now.
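The naive version of combining per-document graphs is just a deduplicating set union over triples; here's a sketch of that, with the caveat that it punts entirely on entity resolution (e.g. "GPT-4" vs "GPT4"), which is where the real work in merging graphs lives:

```python
def merge_graphs(graphs):
    """Merge per-document triple sets into one graph, deduplicating
    exact matches only.

    Entity resolution is deliberately omitted; a real pipeline would
    normalize entity names before unioning.
    """
    merged = set()
    for triples in graphs:
        merged.update(triples)
    return merged

doc_a = {("Inkbot", "based_on", "Llama-2"), ("Llama-2", "made_by", "Meta")}
doc_b = {("Llama-2", "made_by", "Meta"), ("Llama-3", "made_by", "Meta")}
combined = merge_graphs([doc_a, doc_b])
print(len(combined))  # the shared triple is deduplicated, leaving 3
```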

Anyways, near-term plans are to train a llama3 8b, and likely a phi-3 13b version of Inkbot on an improved version of my dataset. Glad to see others as excited as I was on this topic!


