How To: Identify and Fix a Bug, Build Your Own Cosmo #1439
BradHutchings
started this conversation in
General
Replies: 1 comment
You're my hero, Brad. Good to see the debugging process and how to push through initially unknown issues.
While the README walkthrough was helpful for getting started with Cosmo, it took a lot of digging and experimenting to figure out the path from "I think there's a problem" to a fix. So here is my journey.
Inspired by llamafile, I forked llama.cpp to give it the Cosmo treatment: one binary that runs across architectures and platforms, packaged with some supporting files.
My project is Mmojo-Server. It's open source, and I periodically publish builds on HuggingFace. I also package Mmojo-Server in an appliance, currently based on Raspberry Pi. I hope to offer Intel NUC, AMD Ryzen AI, and Mac Mini appliances soon.
The problem was that when run on x86, it was crashing after loading some specific models: IBM Granite v3.3 and Meta Llama v3.2.
Run in gdb to catch the crash:
It shows me:
Alternatively, I can run with the --strace flag and redirect to a file:
I see this at the end:
The first thing to figure out is why it's crashing. gdb tells us that memchr_sse is being called with NULL, 0, 0 params. Looking at the implementation of memchr_sse in libc/intrin/memchr.c, you can see there is no check for NULL before dereferencing s in a call to _mm_cmpeq_epi8 (inlined?):
Meanwhile, the non-SSE version effectively checks for n == 0 in the for statement:
That explains why there's a crash here on x86 but not ARM.
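To make the difference concrete, here is a minimal sketch in C of the two shapes described above. It is not the Cosmopolitan source: the names scalar_memchr, guarded_memchr, and self_check are hypothetical, the vector loop is a simplified stand-in for whatever memchr_sse really does, and __builtin_ctz assumes GCC or Clang. The only point is that a loop bounded by n never touches s when n is 0, while a vectorized path needs its own guard before the first load.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Scalar search (hypothetical name). The loop condition tests n before
   reading through s, so (NULL, c, 0) returns NULL without a dereference. */
static const void *scalar_memchr(const void *s, int c, size_t n) {
  const unsigned char *p = s;
  for (size_t i = 0; i < n; ++i) {
    if (p[i] == (unsigned char)c) return p + i;
  }
  return NULL;
}

/* Vectorized search (hypothetical name) with an explicit guard up front.
   This is an illustrative sketch, not the Cosmopolitan implementation. */
static const void *guarded_memchr(const void *s, int c, size_t n) {
  if (!s || !n) return NULL; /* the parameter check */
#if defined(__SSE2__)
  const unsigned char *p = s;
  const __m128i needle = _mm_set1_epi8((char)c);
  while (n >= 16) {
    /* Compare 16 bytes at once; each matching byte sets one mask bit. */
    __m128i chunk = _mm_loadu_si128((const __m128i *)p);
    int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, needle));
    if (mask) return p + __builtin_ctz((unsigned)mask);
    p += 16;
    n -= 16;
  }
  return scalar_memchr(p, c, n); /* fewer than 16 bytes remain */
#else
  return scalar_memchr(s, c, n); /* non-x86 builds use the scalar loop */
#endif
}

/* Small self-check covering the empty-input case and a normal search. */
static void self_check(void) {
  static const char buf[] = "0123456789abcdefXYZ";
  assert(scalar_memchr(NULL, 0, 0) == NULL);
  assert(guarded_memchr(NULL, 0, 0) == NULL);
  assert(guarded_memchr(buf, 'X', sizeof buf) == buf + 16);
  assert(guarded_memchr(buf, 'q', sizeof buf) == NULL);
}
```

Calling self_check() from a test program exercises both paths. On ARM the #else branch is compiled, which mirrors why the non-SSE code never hit the crash.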
So, do I fix it here or find something upstream? The function trace log, when I scroll up, tells me this is happening while processing a regular expression as part of Google Minja. Due to the size of that function trace log, it proved too difficult to figure out which part of the llama.cpp code was the culprit making that call.
I decided to put a parameter check in memchr_sse():
Now I have to clone the Cosmopolitan repo, make a change, and build the repo. I hadn't had to do that before, and this took me a little more than a day to figure out. Specifically, I stumbled on the tool/cosmocc/package.sh script only by digging through the repo file by file looking for some clue. LOL.
At this point, there is a directory with everything built: ~/$BUILD_COSMOPOLITAN_DIR/cosmocc.
Instead of downloading the latest from cosmo.zip, copy that directory to where you need it locally. I like to make a copy so there's no chance of wrecking what I just built.
I build Mmojo-Server as detailed here. After that, I configure the Cosmo build as detailed here.
Voila, no more crash when loading those particular models.
Hopefully this will save someone some time. Or maybe it could be integrated into the repo README.
Cosmopolitan is an amazing little project! If I had $1M to throw at it, I would find a business model. The key point is that making all platform decisions at compile time (like llama.cpp does) limits the user base to people who know how to build. My user base isn't even great with running stuff from a command line, which is why I'm selling an appliance. Also, CPU isn't dead, especially with small LLM inference.
-Brad