Unicode operators for semantically correct programming
Why do most programming languages use the / character when we have a perfectly good ÷ symbol? Similarly, why use != instead of ≠? Or => rather than →?
The obvious answer is that the humble keyboard usually only has around 100 keys - and most humans have a hard time remembering where thousands of alternate characters are.
Some programming fonts attempt to get around this with ligatures. That allows the user to type <= but have the font display ≤.
Are there any modern programming languages which allow the use of semantically correct Unicode symbols as operators?
As far as I can tell, there's only one!
Scala
Here's a trivial example which creates the ÷ operator:
case class A(v: Float) {
  def ÷(b: A): A = A(v / b.v)
}
val a = A(9)
val b = A(5)
a ÷ b
According to the Scala reference documentation, valid characters for operators include "Unicode categories Sm, So".
That's "Symbols, maths" and "Symbols, others". That covers quite a lot of useful symbols, including some Emoji! So you could have ☺ as an operator.
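To make that concrete, here is a minimal sketch that goes a little further than the ÷ example. The `Num` type, the choice of symbols, and what each one does are all made up for illustration. The last two lines ask the JVM for each character's general category, which for these two characters matches the Sm and So categories the Scala documentation mentions.

```scala
// Hypothetical wrapper type, in the spirit of the `A` class above.
final case class Num(v: Double) {
  def ÷(b: Num): Num = Num(v / b.v)            // U+00F7, category Sm
  def ≤(b: Num): Boolean = v <= b.v            // U+2264, category Sm
  def ≠(b: Num): Boolean = v != b.v            // U+2260, category Sm
  def ☺(b: Num): Num = Num(math.max(v, b.v))   // U+263A, category So (arbitrary meaning)
}

object UnicodeOperatorsDemo {
  def main(args: Array[String]): Unit = {
    val a = Num(9)
    val b = Num(5)
    println(a ÷ b)
    println(a ≤ b)
    println(a ≠ b)
    println(a ☺ b)
    // Ask the JVM which Unicode general category each symbol belongs to.
    println(Character.getType('÷') == Character.MATH_SYMBOL.toInt)
    println(Character.getType('☺') == Character.OTHER_SYMBOL.toInt)
  }
}
```

One thing to watch: Scala derives an operator's precedence from its first character, and anything outside the ASCII operator characters lands in the highest-precedence group. So a home-made ≤ binds tighter than + would, and it is probably wise to add parentheses when mixing the two.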
Others
I took a quick scan through some other modern languages:
- Ruby
- Ruby has a Unicode Math gem which lets you write code like ∛ 27. But, as far as I can tell, there's no way to create your own operators.
- Haskell
- There's an extension for some Unicode characters. But there aren't many there and there's no way to create new ones.
- Python
- Nope. There's no way to create your own operators. And function names can only contain a subset of Unicode.
- C++
- Nope. You can overload existing operators but cannot define your own.
- Go
- No.
- Rust
- Java
- Nuh-uh.
- Javascript
- WAT.
In fact, according to the inimitable Xah Lee, the only other languages which allow user-defined Unicode operators are Julia and Wolfram.
Julia, I had some difficulty with, but this works:
☺(x)=sqrt(x)
print(☺(25))
÷(x,y)=x/y
print(25 ÷ 6)
I am, sadly, not clever enough to even understand the documentation for the Wolfram language.
What next?
Obviously I am now a convert to Scala and will henceforth rewrite all my code in it using multiple Unicode symbols.
But, more practically, I wonder if there's demand for a (new) programming language which treats Unicode operators as first-class citizens? Or perhaps future versions of your favourite language should embrace the loving warmth of the Unicode consortium?
Thoughts?
@Edent Julia supports it out of the box. Type \div<tab> in the REPL (or VS Code) and it will insert ÷, which works as a division symbol: https://docs.julialang.org/en/v1/manual/mathematical-operations/ Also, I believe we've already been there and it ended up quite ugly: https://en.wikipedia.org/wiki/APL_(programming_language)
@Edent Haskell supports Unicode identifiers, and defining your own operators, hence... https://hackage.haskell.org/package/base-unicode-symbols-0.2.4.2/docs/Data-Ord-Unicode.html
@Edent Bring back APL (and for that matter the Vienna Development Method & Z for specification)
Sam says:
I was going to make a joke about Perl, but I checked and Raku does treat ÷ as division:
raku -e'say 1÷2'
https://docs.raku.org/language/unicode_ascii#Other_acceptable_single_codepoints
@Edent Raku (formerly known as Perl 6) has Unicode operators (https://docs.raku.org/language/unicode_ascii) and lets you define your own (https://docs.raku.org/language/optut).
@Edent An argument against ÷ would be the fact that it's not really used in mathematics outside of elementary schooling in some countries, mostly English-speaking ones.
dd says:
Julia doesn't just let the user define Unicode operators, but also has many of them predefined, such as the "less than or equal" and "not equal" you mentioned. The ÷ division operator is also part of the standard library, but doesn't have the same semantics as the "plain ASCII" division operator:
5.0 / 3.0 == 1.666...
5 / 3 == 1.666... # also a Float64; 5 // 3 gives a rational number
But ÷ only returns the quotient, just like dividing integers in C:
5 ÷ 3 == 1
5.0 ÷ 3.0 == 1.0
Square roots are also available (as an operator): √5 == 2.236...
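For comparison, here is a rough sketch of the same split in Scala rather than Julia: the built-in / for ordinary division, and a user-defined ÷ that keeps only the quotient. The `QuotientOps` extension is hypothetical and not part of the Scala standard library. Rounding toward zero is an assumption chosen to mirror C-style integer division.

```scala
object QuotientDemo {
  // Hypothetical extension on Double; name and behaviour are assumptions.
  implicit class QuotientOps(private val a: Double) extends AnyVal {
    // Quotient with the fractional part dropped (rounds toward zero).
    def ÷(b: Double): Double = {
      val q = a / b
      if (q < 0) math.ceil(q) else math.floor(q)
    }
  }

  def main(args: Array[String]): Unit = {
    println(5.0 / 3.0)   // ordinary floating-point division
    println(5.0 ÷ 3.0)   // 1.0
    println(-5.0 ÷ 3.0)  // -1.0
  }
}
```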
Most dependently-typed languages support unicode.
Both Agda and Lean support Unicode (in function names) and let you design your own custom operators. It's very useful for writing proofs about math:
Here's a simple example from Lean:
lemma le_lim {x y : ℝ} {u : ℕ → ℝ} (hu : seq_limit u x)
  (ineg : ∃ N, ∀ n ≥ N, y ≤ u n) : y ≤ x :=
begin
  -- sorry
  apply le_of_le_add_all,
  intros ε ε_pos,
  cases hu ε ε_pos with N hN,
  cases ineg with N' hN',
  let N₀ := max N N',
  specialize hN N₀ (le_max_left N N'),
  specialize hN' N₀ (le_max_right N N'),
  rw abs_le at hN,
  linarith,
  -- sorry
end
@Edent I use and love one of the programming fonts which use ligatures to show appropriate symbols. I get the benefits of typing things on my keyboard that just work with the keyboard and the compiler, while my eyes see things which work for my brain. Win-win!:)
Lawrence says:
Swift allows it: https://docs.swift.org/swift-book/ReferenceManual/LexicalStructure.html#ID418
Dr. Winston O'Boogie says:
Haskell supports defining operators using Unicode out of the box:
$ docker run -it --rm haskell:9
GHCi, version 9.10.1: https://www.haskell.org/ghc/  :? for help
ghci> a ÷ b = a / b
ghci> 9 ÷ 5
1.8
The list you link to is the supported unicode equivalents for builtin operator keywords.
@Edent uiua uses a bunch of Unicode symbols. It even has a nice interpreter which will automatically translate keywords into the relevant symbols.
uiua.org
Toby Jaffey says:
Zig can sort of do it. There's no operator overloading, but it does have generics. It also doesn't allow Unicode directly in identifiers, but it works via the @".." syntax, which is for accessing otherwise illegal identifiers.
This is legal
@Edent A couple of comments mention #rakulang, so tagging it for visibility. But, yes, there's a choice of ASCII or Unicode representations of many operators, and the option to define your own.
@Edent Get out your physical APL keyboard from the 1980s and code like you were supposed to.
Unicomp GA LLC: Unicomp Ultra Classic US APL Black Buckling Spring 104 Key USB Keyboard
@Edent I even get "how on earth am I supposed to use this?" for code with variables called ΔE or ε
@Edent when I tried out Julia last month I was slightly freaked by being able to use latex symbols in my code. Just seems wrong to me, but I'm not really a mathematician so some of the symbols were the problem.
@Edent Origami makes use of → as a pipe operator.
https://weborigami.org/language/syntax#pipe-operator
Origami language syntax