Unicode operators for semantically correct programming


Why do most programming languages use the / character when we have a perfectly good ÷ symbol? Similarly, why use != instead of ≠? Or => rather than ⇒?

The obvious answer is that the humble keyboard usually only has around 100 keys - and most humans have a hard time remembering where thousands of alternate characters are.

Some programming fonts attempt to get around this with ligatures. That allows the user to type <= but have the font display ≤.

Are there any modern programming languages which allow the use of semantically correct Unicode symbols as operators?

As far as I can tell, there's only one!

Scala

Here's a trivial example which creates the ÷ operator:

case class A(v: Float) {
  def ÷(b: A): A = A(v / b.v)
}
val a = A(9)
val b = A(5)
a ÷ b
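
The same pattern should extend to the comparison operators mentioned earlier (<= and !=). Here is a minimal sketch, assuming a hypothetical wrapper type called Num and an enclosing object called UnicodeCompare; neither name comes from the example above, they are only there for illustration.

```scala
// Num and UnicodeCompare are hypothetical names invented for this sketch.
object UnicodeCompare {
  case class Num(v: Double) {
    def ≤(b: Num): Boolean = v <= b.v // U+2264 LESS-THAN OR EQUAL TO
    def ≠(b: Num): Boolean = v != b.v // U+2260 NOT EQUAL TO
  }

  def main(args: Array[String]): Unit = {
    val a = Num(9.0)
    val b = Num(5.0)
    println(a ≤ b) // false, because 9 is greater than 5
    println(a ≠ b) // true
  }
}
```

The methods simply delegate to the ASCII operators, so the Unicode names are purely a spelling choice here.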

According to the Scala reference documentation, valid characters for operators include "Unicode categories Sm, So".

That's "Symbol, math" and "Symbol, other". That covers quite a lot of useful symbols. Including some Emoji! So you could have an emoji as an operator.
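
If you want to check whether a particular symbol falls into one of those two categories, the JVM can tell you. This is a rough sketch using java.lang.Character.getType; the object name OperatorCategories and the helper isSymbolCategory are made up for this example, and it only tests for Sm and So, not the ASCII operator characters that Scala also accepts.

```scala
// OperatorCategories and isSymbolCategory are hypothetical names for this sketch.
object OperatorCategories {
  // True when the code point is in Unicode category Sm (math symbol)
  // or So (other symbol).
  def isSymbolCategory(codePoint: Int): Boolean = {
    val category = Character.getType(codePoint)
    category == Character.MATH_SYMBOL.toInt ||
      category == Character.OTHER_SYMBOL.toInt
  }

  def main(args: Array[String]): Unit = {
    println(isSymbolCategory(0x00F7))    // ÷ DIVISION SIGN, category Sm
    println(isSymbolCategory(0x2603))    // ☃ SNOWMAN, category So
    println(isSymbolCategory('a'.toInt)) // a lowercase letter, category Ll
  }
}
```

Passing a code point (an Int) rather than a Char means the helper also works for symbols outside the Basic Multilingual Plane, which is where most Emoji live.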

Try it in your browser.

Others

I took a quick scan through some other modern languages:

In fact, according to the inimitable Xah Lee, the only other languages which allow user-defined Unicode operators are Julia and Wolfram.

Julia, I had some difficulty with, but this works:

√(x)=sqrt(x)
print(√(25))

÷(x,y)=x/y
print(25 ÷ 6)

I am, sadly, not clever enough to even understand the documentation for the Wolfram language.

What next?

Obviously I am now a convert to Scala and will henceforth rewrite all my code in it using multiple Unicode symbols.

But, more practically, I wonder if there's demand for a (new) programming language which treats Unicode operators as first-class citizens? Or perhaps future versions of your favourite language should embrace the loving warmth of the Unicode Consortium?

Thoughts?



19 thoughts on “Unicode operators for semantically correct programming”

  1. dd says:

    Julia doesn't just let the user define Unicode operators, but also has many of them predefined, such as the "less than or equal" and "not equal" you mentioned. The ÷ division operator is also part of the standard library, but doesn't have the same semantics as the "plain ASCII" division operator:

    5.0 / 3.0 == 1.666...
    5 / 3 == 1.666... # also a float; 5//3 builds a rational number

    But ÷ only returns the quotient, just like dividing integers in C:

    5 ÷ 3 == 1
    5.0 ÷ 3.0 == 1.0

    Square roots are also available (as an operator): √5 == 2.236...


  2. says:

    Most dependently-typed languages support Unicode.

    Both Agda and Lean support Unicode (in functions) and let you define your own custom operators. It's very useful for writing proofs about math:

    Here's a simple example from Lean:

    lemma le_lim {x y : ℝ} {u : ℕ → ℝ} (hu : seq_limit u x)
      (ineg : ∃ N, ∀ n ≥ N, y ≤ u n) : y ≤ x :=
    begin
      -- sorry
      apply le_of_le_add_all,
      intros ε ε_pos,
      cases hu ε ε_pos with N hN,
      cases ineg with N' hN',
      let N₀ := max N N',
      specialize hN N₀ (le_max_left N N'),
      specialize hN' N₀ (le_max_right N N'),
      rw abs_le at hN,
      linarith,
      -- sorry
    end


  3. Dr. Winston O'Boogie says:

    Haskell supports defining operators using Unicode out of the box:

    $ docker run -it --rm haskell:9
    GHCi, version 9.10.1: https://www.haskell.org/ghc/  :? for help
    ghci> a ÷ b = a / b
    ghci> 9 ÷ 5
    1.8

    The list you link to is the supported Unicode equivalents for built-in operator keywords.


  4. Toby Jaffey says:

    Zig can sort of do it. There's no operator overloading, but it does have generics. It also doesn't allow Unicode directly in identifiers, but it works via the @"..." syntax, which is for writing otherwise illegal identifiers.

    This is legal

    const std = @import("std");
    fn @"÷"(comptime T:type, a:T, b:T) T {
        return a/b;
    }
    
    fn @"²"(comptime T:type, a:T) T {
        return a*a;
    }
    
    pub fn main() !void {
        std.debug.print("{d}\n", .{@"÷"(f32, 10, 2)});
        std.debug.print("{d}\n", .{@"²"(i32, 7)});
    }
    
