a genuine phonemic whistle encoding of toki pona
I've been fascinated by whistled languages for years. The problem: every system I could find —
Silbo Gomero, Mazatec, the Turkish bird language — works the same way. You already know the
spoken language, you mouth the words while whistling, and the melody of your internal
vocalization carries through. It sounds like a whistled language. It isn't really one.
A non-speaker can't learn it without first learning the spoken version. The encoding isn't
formally defined: it emerges from mouth shape and varies from speaker to speaker.
I'm experimenting with kalama walo, which instead encodes each toki pona phoneme as a fixed
pair of tones, as a way for humans to talk to each other over distance, and as
a lightweight communication layer for low-power edge devices: things like an ESP32-P4
that can't run heavy speech recognition but can absolutely detect and produce discrete
tone pairs.
Every kalama walo conversation opens with a rising arpeggio through all five scale degrees of the current mood. The first speaker plays it, the other responds. Both are now calibrated to the same root pitch — and have announced their emotional scale before a word is spoken. The opening sequence is the only reserved signal in the system. Everything else is just toki pona.
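As a rough illustration of the opening sequence, the Python sketch below computes the five frequencies of a neutral-mood arpeggio from a chosen root. The ratios are the neutral scale degrees listed in this document's scale table; the function name `opening_sequence` and the 880 Hz root are illustrative assumptions, not part of the encoding.

```python
# Neutral scale degrees as frequency ratios (from the scale table in this document).
NEUTRAL_RATIOS = {1: 1.000, 3: 1.250, 5: 1.500, 7: 1.875, 8: 2.000}

def opening_sequence(root_hz, ratios=NEUTRAL_RATIOS):
    """Return the rising arpeggio (in Hz) through every degree of the scale."""
    # Sort by ratio so the arpeggio always rises, whatever scale is passed in.
    return [round(root_hz * r, 2) for r in sorted(ratios.values())]

# Hypothetical root: 880 Hz is an arbitrary pitch chosen for the example.
print(opening_sequence(880.0))  # [880.0, 1100.0, 1320.0, 1650.0, 1760.0]
```

The responder would play the same five ratios from the root they heard, which is what lets both sides agree on a root without any fixed pitch standard.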
Because Kalama Walo defines phonemes as musical interval ratios rather than mouth-shape contours, it can be produced by any pitched instrument — a flute, a cello, a theremin, a keyboard in legato mode. The melody is the phoneme sequence. You cannot change the notes without changing the words.
This means you can compose pieces of music that simultaneously carry semantic content. A listener who knows the encoding hears words. A listener who doesn't hears a melody with emotional character — shaped by the emotional scale in use, structured by the grammar of the phrases.
Transposition: Because all intervals are relative to the session root, any Kalama Walo piece can be transposed to any key without changing the meaning. A soprano and a bass can have a conversation — only the intervals matter, not the absolute frequencies.
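The transposition claim can be checked numerically. The sketch below renders the same degree sequence at two different roots and compares the ratios between successive notes. The degree sequence is the one this document gives for toki (Sol Do Do Ti Ti Do Do Sol); the two root pitches and the helper names are assumptions chosen for the example.

```python
import math

# Neutral scale degrees as frequency ratios (from the scale table in this document).
NEUTRAL_RATIOS = {1: 1.000, 3: 1.250, 5: 1.500, 7: 1.875, 8: 2.000}

def render(degrees, root_hz):
    """Turn a sequence of scale degrees into absolute frequencies for one root."""
    return [root_hz * NEUTRAL_RATIOS[d] for d in degrees]

def interval_ratios(freqs):
    """Ratios between successive notes: the part that survives transposition."""
    return [b / a for a, b in zip(freqs, freqs[1:])]

# "toki" as spelled out in this document: Sol Do Do Ti Ti Do Do Sol.
TOKI = [5, 1, 1, 7, 7, 1, 1, 5]

bass = render(TOKI, 110.0)      # hypothetical low root
soprano = render(TOKI, 523.25)  # hypothetical high root

same = all(math.isclose(x, y)
           for x, y in zip(interval_ratios(bass), interval_ratios(soprano)))
print(same)  # True
```

The absolute frequencies differ in every position, yet the interval ratios match, which is the property a decoder would rely on.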
Harmony: The melody carries the semantic content, so two independent voices can each sing a different, grammatically correct Toki Pona sentence at the same time, with the combination creating a third meaning. This is a real constrained composition problem, and as far as I know nobody has explored it in a designed interval-ratio whistled language before. Worth exploring seriously.
Songs: The melody is the grammar. The scale mode is the emotion. Compose carefully and the result is a piece of music that is simultaneously a grammatically correct Toki Pona text — the notes are words.
A four-line love song where every note is a phoneme and the melody is the meaning. Composed in the sad scale (♭7): the words say 'love exists' but the minor seventh gives it longing. The rhythm follows Toki Pona stress: the first syllable of each word gets longer notes, unstressed syllables pass quickly, and phrase endings are held. Choose an emotional scale below: same words, same scale degrees, completely different feeling.
Two voices, two sentences, sung simultaneously. Each is complete Toki Pona. Together they say something neither says alone.
mi awen — Voice 1
In Toki Pona, awen is one of the most weighted words in the language. It means stay, remain, wait, persist, continue, endure. It carries duration. It implies that staying is a choice being made against the possibility of not staying. mi awen is not just "I am here" — it is "I am still here. I have continued to be here. I chose to remain."
The melody rises gently, opens, and settles. It sounds like staying.
sina lon — Voice 2
Lon is presence, existence, truth, location. sina lon means you exist, you are present, you are real. It is a statement of someone's existence as a bare fact. Not "I know you're here" or "I see you" — just the fact of your presence as something that is true. In Toki Pona's philosophy of directness, sina lon is one of the most complete things you can say about another person.
The melody reaches, grounds, reaches again, and settles. It sounds like someone being present in a way that still feels remarkable.
together
"I remain / you are here" sung simultaneously is the complete architecture of longing and presence in six syllables. It is not a dialogue — no call and response, no question and answer. Both statements are made at the same time, independently, and together they describe the entire emotional situation: one person persisting, another person being present. The act of remaining and the fact of presence, simultaneous, inseparable.
What makes it specifically Toki Pona in its elegance: both sentences are stripped to the absolute minimum. There is no "because of you" in mi awen. There is no "and I'm glad" in sina lon. Toki Pona removes exactly the kind of embellishment that would make these sentences smaller by making them more explicit. The meaning that isn't said is louder than the meaning that is.
Thirds and fifths throughout. No parallel fifths. The two voices move in contrary motion, one rising while the other falls, which is a hallmark of good counterpoint in the Western tradition. What's striking is what the phoneme constraints chose: out of everything the language could have offered, the pair that works best as two-voice writing turns out to say "I remain / you are here." That could be coincidence. It could also be that the structure of tonal beauty and the structure of Toki Pona have something in common that nobody has looked at closely before.
The rhythm amplifies the meaning. In voice 1 (mi awen), the Sol in awen is held for almost two beats — the classical suspension, stretched time on the act of staying. The final Mi holds for 2.5 beats. In voice 2 (sina lon), the Ti of sina is held long — presence stated with the most reaching interval in the scale, the note that wants to resolve but hasn't yet. Both voices end on long held notes, not quick ones. Remaining and being present are both things that take time.
Because Kalama Walo encodes phonemes as interval pairs, the melody of any phrase is fixed — every note is determined by the phoneme table. But you can sing different words to that same melody. A listener who knows Kalama Walo decodes both simultaneously: the melodic text from the intervals, and the sung text from the syllables. One voice. Two complete Toki Pona sentences. Heard at the same time.
This is a new form with no existing name. The closest things are contrafactum (new words on an existing melody) and quodlibet (two songs simultaneously), but neither captures it — because here both the melody and the sung syllables are in the same language, and both carry meaning that was intentionally designed to create something in the gap between them.
Name: kalama anpa — "the sound underneath" / "the hidden sound." The melodic meaning lives beneath the sung words, audible only to those who know to listen for it. The tune says one thing. The voice says another. Both are real.
Each example below plays the melody of the encoded phrase (what the Kalama Walo intervals spell). The sung phrase is what you vocalize over that melody — same tune, different words, both true. Play the melody, then try singing the alternative text over it.
seeing is knowing · knowing is seeing
A conversation between two instruments — flute and violin — in five movements.
Each instrument is a voice with its own character: the flute is higher, breathier, more curious.
The violin is lower, more sustained, more certain. They speak in turn, then discover something together.
The subject: the relationship between seeing and knowing.
In Toki Pona, lukin (to see, to look) and sona (to know, to understand) are close cousins.
The piece ends with both instruments simultaneously saying the same thought from opposite directions —
and the melodies, which are anagrammatic (same notes in different order), meet in a major third.
language / nothing / language · speak, then silence, then speak
A melodic palindrome: the tone sequence reads identically forwards and backwards, so playing it in reverse produces exactly the same music. The nearest classical relative is the crab canon in Bach's Musical Offering, where a line is played against its own reversal. Here the symmetry sits inside a single line, and it emerges from the phoneme encoding rather than being imposed on it.
toki (language, speak, hello) was found to be palindromic: its eight tones are Sol Do Do Ti · Ti Do Do Sol — symmetrical around the two Ti tones at the center. The word for language is, at the level of its phonemic melody, a mirror image of itself.
ala (nothing, not, none, zero) is also palindromic: Do Do Sol Sol Do Do. The word for nothing folds around itself.
toki ala toki — language, then nothing, then language again. Speech, then silence, then speech. The phrase is a palindrome of palindromes: each word palindromic, the whole palindromic. The rhythm is also palindromic — the duration of every note mirrors the duration of its counterpart from the other end. Playing it backwards produces exactly the same music, at the same tempo, with the same phrasing. You cannot tell which direction time is moving.
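The palindrome property is easy to verify mechanically. The sketch below uses the tone sequences exactly as quoted in the text for toki and ala and checks each word and the whole phrase. It covers pitch only: the note durations are not listed in this document, so the rhythmic symmetry is not checked here.

```python
# Tone sequences exactly as given in the text.
TOKI = ["Sol", "Do", "Do", "Ti", "Ti", "Do", "Do", "Sol"]
ALA = ["Do", "Do", "Sol", "Sol", "Do", "Do"]

def is_palindrome(tones):
    """True when the sequence reads the same forwards and backwards."""
    return tones == tones[::-1]

phrase = TOKI + ALA + TOKI  # toki ala toki

print(is_palindrome(TOKI), is_palindrome(ALA), is_palindrome(phrase))
# True True True
```

The third result follows from the first two: reversing the phrase reverses the word order (toki, ala, toki stays the same) and reverses each word, and each word is its own reversal.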
Any instrument producing discrete sustained pitches can perform Kalama Walo: flute, recorder, cello, violin (bowed, not plucked), theremin, keyboard in legato mode, or the singing voice. The performer needs the interval positions for a chosen root and the phoneme table — then the instrument speaks.
Plucked and struck instruments (guitar, piano) work but require attention to the attack transient. Bowed strings are ideal: sustained, controllable pitch, natural glide between intervals.
Two performers can have a conversation on instruments. A flute and a cello, across a courtyard. The music they make is also a dialogue in Toki Pona.
The mi awen · sina lon duet above demonstrates genuine two-voice Toki Pona counterpoint — both voices simultaneously singing different grammatically correct sentences, the combination creating a third meaning. What the phoneme constraints produced — thirds and fifths throughout, contrary motion, zero parallel fifths — turns out to say "I remain / you are here." The language and the counterpoint agreed on something.
Singing or speaking Toki Pona words aloud while an instrument plays Kalama Walo, either the same words or a different sentence, is immediately possible. It is also musically interesting as a contrast between the two modes of the same language.
All 15 toki pona phonemes encoded as ordered tone pairs. The first tone encodes articulation position; the second encodes manner. Click any row to hear the phoneme pair synthesized.
| phoneme | pair | IPA | group | notes |
|---|---|---|---|---|
Five scale degrees: root (×1.000), major third (×1.250), perfect fifth (×1.500), major seventh (×1.875), octave (×2.000). Every toki pona phoneme is an ordered pair of tones drawn from these degrees; a degree may repeat, as in A = 1-1.
| degree | ratio | maps to |
|---|---|---|
| 1 | 1.000 | all vowels (anchor tone) |
| 3 | 1.250 | bilabial consonants — P M W |
| 5 | 1.500 | alveolar consonants — T N L S |
| 7 | 1.875 | velar / palatal — K J |
| 8 | 2.000 | glottal — H + opening sequence |
Vowels: A=1-1, E=1-3, I=1-5, O=1-7, U=1-8. The mouth opens on 1-1 and closes progressively as the interval widens to the octave. The articulatory logic is the same for consonants — front of mouth anchors on degree 3, back of mouth on degree 7. The encoding is learnable from its own internal structure.
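To make the pair encoding concrete, here is a minimal encoder sketch. It is deliberately partial: the vowel pairs are the ones stated above, while T, K and L are inferred from the tone sequences this document quotes for toki and ala. The remaining consonants are left out rather than guessed, and the names `PHONEME_PAIRS` and `encode` are hypothetical.

```python
# Partial phoneme table. Vowel pairs are stated in the text; T, K and L are
# inferred from the tone sequences this document gives for "toki" and "ala".
# All other consonants are intentionally missing rather than invented.
PHONEME_PAIRS = {
    "a": (1, 1), "e": (1, 3), "i": (1, 5), "o": (1, 7), "u": (1, 8),
    "t": (5, 1), "k": (7, 1), "l": (5, 5),
}

# Solfege names for readability; "Do'" for the octave is this sketch's own label.
SOLFEGE = {1: "Do", 3: "Mi", 5: "Sol", 7: "Ti", 8: "Do'"}

def encode(word):
    """Spell a word as a flat list of scale degrees, two per phoneme."""
    tones = []
    for ch in word.lower():
        if ch not in PHONEME_PAIRS:
            raise KeyError(f"no tone pair known for phoneme {ch!r}")
        tones.extend(PHONEME_PAIRS[ch])
    return tones

print([SOLFEGE[d] for d in encode("toki")])
# ['Sol', 'Do', 'Do', 'Ti', 'Ti', 'Do', 'Do', 'Sol']
```

Raising an error for unknown phonemes keeps the sketch honest about which pairs are actually documented here.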
All intervals are relative — not absolute frequencies. The opening sequence establishes the session root. Both speakers calibrate to each other, not to a fixed pitch standard.
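On the receiving side, relative intervals suggest a simple decoding step: take the root from the opening sequence, then snap each detected pitch to the nearest scale degree. The sketch below is one possible approach, not a description of any existing decoder. Comparing in log-frequency, taking the first opening note as the root, and the sample detector readings are all assumptions made for illustration.

```python
import math

# Neutral scale degrees as frequency ratios (from the scale table in this document).
NEUTRAL_RATIOS = {1: 1.000, 3: 1.250, 5: 1.500, 7: 1.875, 8: 2.000}

def calibrate_root(opening_hz):
    """Assumption: the first note of the rising opening arpeggio is the root."""
    return opening_hz[0]

def nearest_degree(freq_hz, root_hz, ratios=NEUTRAL_RATIOS):
    """Classify a measured pitch as the closest scale degree above the root."""
    # Compare in log-frequency so errors are judged as musical distance.
    measured = math.log2(freq_hz / root_hz)
    return min(ratios, key=lambda d: abs(math.log2(ratios[d]) - measured))

# Hypothetical pitch-detector readings from a slightly out-of-tune whistle.
opening = [701.0, 874.0, 1052.0, 1318.0, 1399.0]
root = calibrate_root(opening)
heard = [1049.0, 703.0, 699.0, 1310.0]

print([nearest_degree(f, root) for f in heard])  # [5, 1, 1, 7]
```

Degrees 7 and 8 are the closest pair in the scale, about 112 cents apart, so that is where a real detector would need the tightest pitch tolerance.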
The speaker substitutes one scale degree to change the emotional quality of everything they say — exactly as major and minor keys work in music. Same words, different scale, completely different felt quality.
| mood | substitution | effect |
|---|---|---|
| neutral | 1 3 5 7 8 | standard — informational |
| sad | 1 3 5 ♭7 8 | minor seventh — plaintive, unresolved |
| content | 1 3 5 6 8 | major sixth — relaxed, swinging |
| tense | 1 ♭3 5 7 8 | minor third — anxious, watchful |
| excited | 1 3 5 7 9 | ninth — expansive, reaching past itself |
| uneasy | 1 3 ♭5 7 8 | tritone — deeply unsettled |
The opening sequence itself announces the speaker's mood — an excited speaker's arpeggio reaches further than a neutral one. The ceremony carries meaning before the first word.
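The substitution table can be expressed as data. In the sketch below, only the neutral ratios come from this document; the ratios for ♭7, 6, ♭3, 9 and ♭5 are assumed just-intonation values (9/5, 5/3, 6/5, 9/4, 45/32) picked for illustration, and other tunings are equally plausible. The names `MOODS`, `scale_for` and `opening_sequence` are hypothetical.

```python
from fractions import Fraction as F

# Neutral ratios are from this document's scale table.
NEUTRAL = {"1": F(1), "3": F(5, 4), "5": F(3, 2), "7": F(15, 8), "8": F(2)}

# mood -> (degree replaced, new label, ASSUMED just-intonation ratio)
MOODS = {
    "neutral": None,
    "sad":     ("7", "b7", F(9, 5)),
    "content": ("7", "6",  F(5, 3)),
    "tense":   ("3", "b3", F(6, 5)),
    "excited": ("8", "9",  F(9, 4)),
    "uneasy":  ("5", "b5", F(45, 32)),
}

def scale_for(mood):
    """Return the five-degree scale for a mood, keeping the slot order."""
    sub = MOODS[mood]
    if sub is None:
        return dict(NEUTRAL)
    old, new, ratio = sub
    return {(new if k == old else k): (ratio if k == old else v)
            for k, v in NEUTRAL.items()}

def opening_sequence(mood, root_hz):
    """Rising arpeggio in Hz through all five degrees of the mood's scale."""
    return [float(root_hz * r) for r in scale_for(mood).values()]

print(opening_sequence("neutral", 440.0)[-1])  # 880.0
print(opening_sequence("excited", 440.0)[-1])  # 990.0
```

With these assumed ratios the excited arpeggio tops out above the neutral octave, matching the description of an opening that reaches further.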
Because kalama walo is a genuine phonemic encoding — not a mouth-shape performance — any two people who know the system can whistle fluent toki pona to each other across distances where speech doesn't carry. A noisy trail. A windy hillside. Between boats on a canal. A crowded market where you don't want to be overheard.
The discrete tone pairs should cut through ambient noise better than consonant-heavy speech. The musical quality means bystanders register it as someone whistling, not as communication. Two people exchanging kalama walo across a valley sound like birdsong. That's not a bug.
Toki pona's 137-word vocabulary is small enough that two people can become genuinely fluent in the encoding in an afternoon — and the emotional scale system means you convey not just words but feeling in the same breath.
This is a browser-based synthesizer for kalama walo. The voice synthesis uses the Web Audio API to produce whistled tones with vibrato, glide between the two tones of each pair, optional breathiness and harmonic content, reverb, echo, and chorus. The presets suggest different voice characters — from a pure clean whistle to something more robotic or ghostly. Everything is adjustable.
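For readers who want to experiment outside the browser, the following is a minimal offline sketch of one tone pair with a glide and light vibrato. It is not the page's Web Audio implementation: the timing, vibrato depth, fade length and the 880 Hz root are all assumed values, and the breathiness, reverb, echo and chorus stages are omitted.

```python
import math

SAMPLE_RATE = 44100  # samples per second (assumed)

def tone_pair(f1, f2, dur=0.4, glide=0.1, vib_hz=5.0, vib_depth=0.004):
    """Return samples in [-1, 1] for one pair: hold f1, glide, hold f2."""
    n = int(SAMPLE_RATE * dur)
    hold = (dur - glide) / 2  # seconds spent on each steady tone
    fade = SAMPLE_RATE // 100  # 10 ms fade in and out
    phase, out = 0.0, []
    for i in range(n):
        t = i / SAMPLE_RATE
        if t < hold:
            f = f1
        elif t < hold + glide:
            # Interpolate in log-frequency so the glide is musically even.
            x = (t - hold) / glide
            f = f1 * (f2 / f1) ** x
        else:
            f = f2
        # Gentle vibrato: a small periodic wobble around the target pitch.
        f *= 1.0 + vib_depth * math.sin(2 * math.pi * vib_hz * t)
        # Accumulate phase so frequency changes do not cause clicks.
        phase += 2 * math.pi * f / SAMPLE_RATE
        # Linear fade at both ends to avoid clicks at the edges.
        env = min(1.0, i / fade, (n - 1 - i) / fade)
        out.append(env * math.sin(phase))
    return out

# Degree pair 5-1 on a hypothetical 880 Hz root.
samples = tone_pair(880.0 * 1.5, 880.0 * 1.0)
print(len(samples))
```

The samples could be written out with the standard-library `wave` module after scaling to 16-bit integers.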
The file works completely offline once downloaded — single HTML file, no external dependencies. Feel free to share it.
Built by The Orange Garage. Kalama walo is original work. Toki pona is the creation of Sonja Lang — tokipona.org.