Show HN: I built an AI that turns GitHub codebases into easy tutorials
(github.com) | 893 points by zh2408 | 19 April 2025 | 170 comments

Comments

This is actually really cool. I just tried it out using an AI Studio API key and was pretty impressed. One issue I noticed was that the output was a little too much "for dummies". Spending paragraphs explaining what an API is through restaurant analogies is unnecessary, followed by more paragraphs on what GraphQL is. Every chapter seems to suffer from this. The generated documentation seems better suited to a slightly technical PM than to a software engineer. This can probably be mitigated by refining the prompt.
The prompt might also be better if it encouraged variety in diagrams. For some things, a flowchart would fit better than a sequence diagram (e.g., a durable state-machine workflow written using AWS Step Functions).
While the doc generator is a useful example app, the really interesting part is how you used Cursor to start a PocketFlow design doc for you, then fine-tuned the details of the design doc to describe the PocketFlow execution graph and utilities you wanted the design of the doc generator to follow…and then you used Cursor to generate all the code for the doc-generator application.
This really shows off that the simple node graph, shared storage and utilities patterns you have defined in your PocketFlow framework are useful for helping the AI translate your documented design into (mostly) working code.
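The pattern is easy to picture with a toy version: nodes that read from and write to a shared dict, chained into a graph. A minimal self-contained sketch of the idea (an illustration only, not PocketFlow's actual API):

    # Toy version of the node-graph + shared-store pattern.
    class Node:
        def __init__(self):
            self.successor = None

        def __rshift__(self, other):  # node_a >> node_b chains nodes
            self.successor = other
            return other

        def run(self, shared):  # each node reads/writes the shared dict
            raise NotImplementedError

    class FetchRepo(Node):
        def run(self, shared):
            shared["files"] = ["main.py", "utils.py"]  # stand-in for a real crawl

    class WriteChapter(Node):
        def run(self, shared):
            shared["chapter"] = f"This repo has {len(shared['files'])} files."

    def run_flow(start, shared):
        node = start
        while node:
            node.run(shared)
            node = node.successor

    shared = {}
    fetch = FetchRepo()
    fetch >> WriteChapter()
    run_flow(fetch, shared)
    print(shared["chapter"])

Keeping each node this small is presumably what makes the design-doc-to-code step tractable for the model.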
I had not used Gemini before, so I spent a fair bit of time yak shaving to get access to the right APIs and set up my Google project. (I have an OpenAI key but it wasn't clear how to use that service.)

I changed it to use an API key instead of the default project/location option, and I changed it to use a different model:

    model = os.getenv("GEMINI_MODEL", "gemini-2.5-pro-preview-03-25")

I used the preview model because I got rate limited and the error message suggested it.

I used this on a few projects from my employer:

- https://github.com/prime-framework/prime-mvc a largish open source MVC Java framework my company uses. I'm not overly familiar with this, though I've read a lot of code written in this framework.

- https://github.com/FusionAuth/fusionauth-quickstart-ruby-on-... a smaller example application I reviewed and am quite familiar with.

- https://github.com/fusionauth/fusionauth-jwt a JWT Java library that I've used but not contributed to.

Overall thoughts:

Lots of exclamation points.

Thorough overview, including of some things that were not application specific (Rails routing).

Great analogies. Seems to lean on them pretty heavily.

Didn't see any inaccuracies in the tutorials I reviewed.

Pretty amazing overall!

Like at least one other person in the comments mentioned, I would like a slightly different tone. Perhaps a good feature would be a "style template" that could be chosen to match your preferred writing style. I may submit a PR, though not if it takes a lot of time.
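For anyone doing the same yak shave: the API-key change described above might look something like this with the google-genai SDK (the exact client line isn't quoted in the comment, so treat this as an assumption):

    import os
    from google import genai

    # Authenticate with a plain AI Studio API key instead of a
    # Vertex-style project/location pair.
    client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))

    model = os.getenv("GEMINI_MODEL", "gemini-2.5-pro-preview-03-25")
    response = client.models.generate_content(model=model, contents="Hello")
    print(response.text)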
Woah, this is really neat.
My first step for many new libraries is to clone the repo, launch Claude Code, and ask it to write good documentation for me. This would save a lot of steps for me!
The tutorial on requests looks uncanny for being generated with no prior context. The use cases and examples it gives are too specific. It is making up terminology for concepts that are not mentioned once in the repository, like "functional API" and "hooks checkpoints". There must be thousands of tutorials on requests online that every AI was already trained on. How do we know that it is not using them?
I built Browser Use. Dayum, the results for our lib are really impressive. You didn't touch the outputs at all?
One problem we have is keeping the docs in sync with the current codebase (code examples break sometimes). Wonder if I could use parts of Pocket to help with that.
This is nice and fun for getting some fast indications on an unknown codebase, but, as others said here and elsewhere, it doesn't replace human-made documentation.
At the top there's some neat high-level stuff, but below that it quickly turns into code-written-in-human-language.
I think it should be possible to extract more useful usage patterns by poking into the related unit tests. How to use it is what matters most to tutorial readers.
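One cheap way to do that: before generating a chapter, grab the windows of test code that mention the relevant symbol and feed them to the LLM as extra context. A rough sketch (the file-name convention and window size are assumptions):

    import re
    from pathlib import Path

    def usage_snippets(repo_root: str, symbol: str, window: int = 5):
        """Collect small windows of test code that mention `symbol`."""
        snippets = []
        for path in Path(repo_root).rglob("test*.py"):
            lines = path.read_text(errors="ignore").splitlines()
            for i, line in enumerate(lines):
                if re.search(rf"\b{re.escape(symbol)}\b", line):
                    chunk = "\n".join(lines[max(0, i - window): i + window])
                    snippets.append(f"# {path}:{i + 1}\n{chunk}")
        return snippets

    # Feed the top few snippets into the chapter prompt as real usage examples.
    for s in usage_snippets(".", "Session")[:3]:
        print(s, "\n---")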
BLOATED. This project is 100 lines of code, but everything that is non-code related is bloated like a gas giant. All the text and videos are written by an LLM. The author would learn from understanding that QUANTITY isn't QUALITY; toning down the verbiage would greatly benefit what they are trying to communicate.
PS: The generated "design documents" are 2k+ lines long. This seems like a great way to exceed quotas.
Very cool, thanks for sharing. I imagine that this will make a lot of my fellow technical writers (even more) nervous about the future of our industry. I think the reality is more along the lines of:
* Previously, it was simply infeasible for most codebases to get a decent tutorial for one reason or another. E.g. the codebase is someone's side project and they don't have the time or energy to maintain docs, let alone a tutorial, which is widely regarded as one of the most labor-intensive types of docs.
* It's always been hard to persuade businesses to hire more technical writers because it's perennially hard to connect our work to the bottom or top line.
* We may actually see more demand for technical writers because it's now more feasible (and expected) for software projects of all types to have decent docs. The key future skill would be knowing how to orchestrate ML tools to produce (and update) docs.
(But I'm also under no delusion: it's definitely possible for TWs to go the way of the dodo bird and animatronics professionals.)
I think I have a very good way to evaluate this "turn GitHub codebases into easy tutorials" tool but it'll take me a few days to write up. I'll post my first impressions to https://technicalwriting.dev
P.S. There has been a flurry of recent YC startups focused on automating docs. I think it's a tough space; the market is very fragmented. Because docs are such a widespread and common need, I imagine that a lot of the best practices will get commoditized and open sourced (exactly like Pocket Flow is doing here).
This is super cool! I attempted to use this on a project and kept running into "This model's maximum context length is 200000 tokens. However, your messages resulted in 459974 tokens. Please reduce the length of the messages." I used OpenAI o4-mini. Is there an easy way to handle this gracefully? Basically, do you have thoughts on how to make tutorials for really large codebases or project directories?
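One workaround in the meantime might be a two-pass approach: pack files into batches that fit under a token budget, summarize each batch, then write chapters from the summaries instead of the raw source. A rough sketch (the 4-chars-per-token estimate and the two-pass design are assumptions, not the tool's actual behavior):

    def batch_files(files, max_tokens=150_000):
        """Greedily pack {path: source} into batches under a token budget."""
        est = lambda text: len(text) // 4  # crude: ~4 chars per token
        batches, batch, used = [], {}, 0
        for path, src in files.items():
            need = est(src)
            if used + need > max_tokens and batch:
                batches.append(batch)
                batch, used = {}, 0
            batch[path] = src
            used += need
        if batch:
            batches.append(batch)
        return batches

    # Summarize each batch with the LLM, then build the tutorial
    # from the per-batch summaries rather than the full source.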
As an extension to this general idea: AI generated interactive tutorials for software usage might be a good product. Assuming it was trained on the defined usage paths present in the code, it would be able to guide the user through those usages.
Yes! AI for docs is one of the use cases I'm bullish on. There is a nice feedback loop where these docs will help LLMs understand your code too. You can write a GH action to check whether a code change / release changes the docs, so they stay fresh, and run your tutorials to ensure that they remain correct.
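A minimal version of that check is easy to sketch: extract the fenced Python blocks from each tutorial and run them in CI, failing the build when one breaks (the docs/ path and the python fence convention are assumptions):

    import re
    import subprocess
    import sys
    import tempfile
    from pathlib import Path

    FENCE = re.compile(r"```python\n(.*?)```", re.DOTALL)

    failures = 0
    for doc in Path("docs").rglob("*.md"):
        for i, block in enumerate(FENCE.findall(doc.read_text())):
            # Run each example in isolation; a non-zero exit fails the build.
            with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
                f.write(block)
            result = subprocess.run([sys.executable, f.name], capture_output=True)
            if result.returncode != 0:
                failures += 1
                print(f"FAILED {doc} block {i}:\n{result.stderr.decode()}")
    sys.exit(1 if failures else 0)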
Do you have plans to expand this to include more advanced topics like architecture-level reasoning, refactoring patterns, or onboarding workflows for large-scale repositories?
I actually have created something very similar here: https://github.com/Black-Tusk-Data/crushmycode, although with a greater focus on 'pulling apart' the codebase for onboarding.
So many potential applications of the resultant knowledge graph.
Great stuff, I may try it with a local model. I think the core logic for the final output is all in the nodes.py file, so I guess one can try and tweak the prompts, or create a template system.
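The "style template" idea mentioned above could be as small as a dict of audience preambles prepended to each chapter prompt. A sketch (these names are hypothetical, not the repo's actual structure):

    STYLE_TEMPLATES = {
        "beginner": "Explain with analogies; assume no prior knowledge.",
        "engineer": "Be terse and precise; skip analogies; assume the reader "
                    "already knows what APIs, HTTP, and GraphQL are.",
    }

    def build_chapter_prompt(chapter_context: str, style: str = "engineer") -> str:
        # Prepend the chosen style preamble to the normal chapter prompt.
        return f"{STYLE_TEMPLATES[style]}\n\nWrite this tutorial chapter:\n{chapter_context}"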
Really nice work, and thank you for sharing. These are great demonstrations of the value of LLMs, which help counter the negative view of their impact on junior engineers.
This helps bridge the gap left by most projects' lack of up-to-date documentation.
Just need to find a way to integrate it into the deployment pipeline and output some markdown (or another format) to send to whatever your company is using (or simply a live website), I'd say.
This is definitely a cromulent idea, although I’ve realised lately that ChatGPT with search turned on is a great balance of tailoring to my exact use case and avoiding hallucinations.
I suppose I'm just a little bit bothered by your saying you "built an AI" when all the heavy lifting is done by a pretrained LLM. Saying you made an AI-based program or hell, even saying you made an AI agent, would be more genuine than saying you "built an AI" which is such an all-encompassing thing that I don't even know what it means. At the very least it should imply use of some sort of training via gradient descent though.
It appears it's leveraging the docs and learned tokens more than the actual code. For example, I don't believe it could achieve that understanding of LevelDB without the prior knowledge and extensive material it has probably already been trained on.
I hate this language: "built an AI". Did you train a new model to do this? Or are you in fact calling ChatGPT 4o or Sonnet 3.7 with some specific prompts?
If you trained a model from scratch to do this I would say you "built an AI", but if you're just calling existing models in a loop then you didn't build an AI. You just wrote some prompts and loops and did some RAG. Which isn't building an AI and isn't particularly novel.
This is neat, but I did find an error in the output pretty quickly:

    # Use the Session as a context manager
    with requests.Session() as s:
        s.get('https://httpbin.org/cookies/set/contextcookie/abc')
        response = s.get(url)  # ???
        print("Cookies sent within 'with' block:", response.json())
Impressive project!
See design doc https://github.com/The-Pocket/Tutorial-Codebase-Knowledge/bl...
And video https://m.youtube.com/watch?v=AFY67zOpbSo
    # Drop-in local-model variant of the LLM call, using Ollama.
    from ollama import chat, ChatResponse

    def call_llm(prompt, use_cache: bool = True, model="phi4") -> str:
        response: ChatResponse = chat(
            model=model,
            messages=[{'role': 'user', 'content': prompt}],
        )
        return response.message.content
https://passo.uno/whats-wrong-ai-generated-docs/
You built it in one afternoon? I need to figure out these mythical abilities.
I'd thought about this idea a few weeks back but could not figure out how to implement it.
Amazing job, OP.
You can tell that because simonw writes quite heavily documented code and the logic is pretty straightforward, it helps the model a lot!
https://github.com/Florents-Tselai/Tutorial-Codebase-Knowled...
Can see some fine-tuning after generation being required, but assuming you know your own codebase, that's not an issue anyway.
Put in the Postgres or Redis codebase, get a good understanding, and get going to contribute.
Looks inside
REST API calls
Thanks buddy! This will be very helpful!!
It seems a trifle... overexcited at times.
I wonder why all the examples are from projects with great docs already, so it doesn't even need to read the actual code.
With the rise of AI, understanding software will become relatively easy.