Jibo Workshop

Code examples from the Jibo Workshop. Each numbered script in scripts/ matches a step in the workshop, and pipelines/ has the advanced GPT examples. Both Node.js and Python versions are available.

Note: On macOS (and some Linux systems), python and pip may not be recognized. If that happens, use python3 and pip3 instead (e.g. python3 scripts/python/01_hello.py).

Setup

Make sure Jibo is in rosbridge mode: open http://your-jibo-hostname.local:9090 in your browser and press the "Enter Rosbridge" button in the top right. You should see a robot image appear on Jibo's screen.

git clone https://github.com/mitmedialab/jibo-workshop.git
cd jibo-workshop
cp .env.example .env

Edit .env and set JIBO_HOST to your robot's hostname (printed on the bottom of the base) or IP address. For the GPT pipelines, also set OPENAI_API_KEY.

JavaScript dependencies

npm install

Python dependencies

pip install roslibpy python-dotenv websockets

Scripts

Every example is available in both JS and Python. They do the same thing — pick whichever language you prefer.

#	What it does	JS	Python
01	Make Jibo speak	`node scripts/js/01_hello.js`	`python scripts/python/01_hello.py`
02	Play an animation	`node scripts/js/02_animate.js`	`python scripts/python/02_animate.py`
03	LED ring control	`node scripts/js/03_led.js`	`python scripts/python/03_led.py`
04	Play a sound effect	`node scripts/js/04_sound.js`	`python scripts/python/04_sound.py`
05	Stop current action	`node scripts/js/05_stop.js`	`python scripts/python/05_stop.py`
06	Listen to Jibo's state	`node scripts/js/06_state.js`	`python scripts/python/06_state.py`
07	Speak + animate simultaneously	`node scripts/js/07_speak_animate.js`	`python3 scripts/python/07_speak_animate.py`
08	Chain actions in sequence	`node scripts/js/08_chain.js`	`python scripts/python/08_chain.py`
09	Cycle LED colors	`node scripts/js/09_led_cycle.js`	`python scripts/python/09_led_cycle.py`
10	Capture microphone audio	`node scripts/js/10_mic.js`	`python scripts/python/10_mic.py`
11	Capture a photo	`node scripts/js/11_camera.js`	`python scripts/python/11_camera.py`

Pipelines

These are standalone scripts that connect Jibo to OpenAI for bilingual Arabic/English AI conversations. They build on the basics from scripts/ (mic streaming, camera capture, TTS) and add GPT on top. Requires OPENAI_API_KEY in .env.

File	What it does
`jibo_realtime.js` / `.py`	Live voice conversation via OpenAI Realtime API
`jibo_whisper_gpt.js` / `.py`	Step-by-step Whisper STT + GPT-4o pipeline
`jibo_vision.js` / `.py`	Camera photo + GPT-4o Vision analysis

Live Voice Conversation (Realtime API)

node pipelines/js/jibo_realtime.js        # JS
python pipelines/python/jibo_realtime.py  # Python

Streams Jibo's microphone directly to OpenAI's Realtime API over WebSocket for real-time voice conversation. Jibo naturally mixes Arabic and English in its responses. The mic is muted while Jibo speaks to prevent it from hearing itself.

Flow: Jibo mic (port 3838) -> OpenAI Realtime API -> GPT generates text -> Jibo speaks via TTS -> mic unmutes -> repeat

Whisper + GPT-4o

node pipelines/js/jibo_whisper_gpt.js        # JS
python pipelines/python/jibo_whisper_gpt.py  # Python

A step-by-step pipeline that gives you more control than the Realtime API. Each stage runs independently, so you can swap out Whisper for another STT or change GPT models without touching the rest.

Flow: Jibo mic records (VAD auto-stop) -> Whisper transcribes -> GPT-4o generates bilingual response -> Jibo speaks -> loop

VAD (Voice Activity Detection) runs locally to detect when you start and stop speaking. The defaults work well for most rooms, but you can tune them directly in the file if needed.

Camera + GPT-4o Vision

node pipelines/js/jibo_vision.js                              # JS
python pipelines/python/jibo_vision.py                        # Python

# Options (same for both):
node pipelines/js/jibo_vision.js --loop                       # continuous mode
node pipelines/js/jibo_vision.js --prompt "what color is this?"
python pipelines/python/jibo_vision.py --loop
python pipelines/python/jibo_vision.py --prompt "what color is this?"

Captures a photo from Jibo's camera (port 8486) and sends it to GPT-4o Vision for analysis. Jibo describes what it sees in mixed Arabic/English. In loop mode it takes a new photo every 8 seconds.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
pipelines		pipelines
scripts		scripts
.env.example		.env.example
.gitignore		.gitignore
README.md		README.md
package-lock.json		package-lock.json
package.json		package.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Jibo Workshop

Setup

JavaScript dependencies

Python dependencies

Scripts

Pipelines

Live Voice Conversation (Realtime API)

Whisper + GPT-4o

Camera + GPT-4o Vision

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Jibo Workshop

Setup

JavaScript dependencies

Python dependencies

Scripts

Pipelines

Live Voice Conversation (Realtime API)

Whisper + GPT-4o

Camera + GPT-4o Vision

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages