Do open source licences cover the Ship of Theseus?


I recently downloaded a single-page HTML template for a project I was working on. I wanted a good-looking scaffold to help me get up and running quickly. The code had an attribution licence which I was happy to comply with.

I ended up removing a whole bunch of the HTML that I didn't need. That also allowed me to remove the majority of the CSS, which was now unused. I deleted all the JavaScript. I added some semantic markup and updated a few of the outdated coding conventions. I also added newer CSS to support modern features. And I replaced all the default images and fonts with something I preferred.

In total, 75% of the HTML was rewritten and 61% of the CSS had changed.

Screenshot from GitLab showing 2 files with 167 additions and 562 deletions.
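
The figures above come from GitLab's diff counts. For anyone who wants to put a rough number on their own project, here is a minimal Python sketch that estimates how many of the original lines survive in the current file. The function name surviving_fraction and the sample CSS are made up for illustration, and a line count like this says nothing about whether the surviving lines are legally significant.

```python
import difflib


def surviving_fraction(original: str, current: str) -> float:
    """Share of the original's non-blank lines that still appear, in order, in the current text."""
    # Compare stripped, non-blank lines so that re-indentation alone doesn't count as a change.
    old = [line.strip() for line in original.splitlines() if line.strip()]
    new = [line.strip() for line in current.splitlines() if line.strip()]
    if not old:
        return 0.0
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    # Each matching block is a run of identical lines; the final dummy block has size 0.
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return kept / len(old)


# Hypothetical "before" and "after" stylesheets, purely for illustration.
original_css = """\
body { background: #6082B6; }
h1 { font-size: 2em; }
p { margin: 1em; }
.footer { color: grey; }
"""

current_css = """\
body { background: #6082B6; }
h1 { font-size: 3rem; }
main { display: grid; }
"""

print(f"{surviving_fraction(original_css, current_css):.0%} of the original lines survive")
```

With the sample above, only the body rule is unchanged, so one of the four original lines survives.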

Is there enough of the original files left to warrant attribution according to the licence terms?

Let's take it to an extreme. Suppose I really loved the background colour used by a piece of free software. If all I copied was body { background: #6082B6; }, would that require attribution?

I think there's a reasonable argument that de minimis non curat lex - the law cares not for small things. Is anyone seriously going to argue that I stole half a dozen bytes? Could they prove that I copied that single line from them? Would anyone care?

And yet, morally, I feel that I should give credit.

Much like the apocryphal sculptor, I have removed everything that wasn't necessary. But I think the poor sod who lugged the block of marble deserves acknowledgment.

At what point do you say "this has changed so much that it is no longer necessary to abide by the original licence"?


12 thoughts on “Do open source licences cover the Ship of Theseus?”

  1. said on tech.lgbt:

    @Edent
    I tend to leave the attribution in regardless of how much has changed. Even if there's hardly anything left of the original, to me it's not the code as much as it is the code plus (ideas, inspiration, comments, links to others, etc.). So the attribution is not so much to thank for the code, but to direct others to a helpful or inspiring person.

  2. Alex Gibson says:

    A very real concern and one I've had in 3D printer design - in such a heavily derivative industry, even if you create something from scratch in a clean room, there will be a heavy heritage of design influence.

    If I sunk your ship of Theseus, or hacked your workplace and destroyed every copy of your HTML page, what would be the most expeditious way for you to rebuild it? Would you need to go back to the original example page, to 'dive the wreck' for any of those nuggets you copied and retained? Or would you start from a blank page and start typing?

    How much actual work would it take to refactor every unchanged line with your own code - not just tweaking a character to game the system? And how much meaningful impact would this make to the end product?

    I think you've rightly identified a moral obligation to at least credit the previous developer for the leg-up they gave you. "Inspired by". If you then choose NOT to carry forward any licence obligations of the original IP, you'd have to be prepared to justify why you feel your new code is sufficiently far removed from the original that the 25% of HTML lines and 39% of CSS lines that carried over unchanged are 'de minimis' contribution to the final product.

    Good coders are productively lazy and, like good artists, copy snippets from all over the place, from datasheets to code libraries to their own previous work. But only you would be able to look them in the eye and say those lines could have come from anywhere.

  3. KeyXote says:

    An interesting introspective into the Labyrinth of paradoxical derivation, intuition, ingenuity and creativity amongst others. Philosophically this Socratic prompt contains many fields of mind, many layers of complexity and depth. A mirror maze of sorts, a game. Does Pac-Man give credit to Daedalus, is Pac-Man aware of the construct, of the ghosts? Personally I would credit to anyone who has placed the effort in building the construct no matter the size of their contribution. We all reflect each other in the collective commons of ideas. A theory may be born to an individual a priori to the knowledge that a proof or idea already exists, and for that matter, always may have existed. Knowledge and acknowledgment of the initial creator/creation is a reflection of self in many ways. An idea is tangential and remains constrained until it is shared with others, at which point one can reflectively appreciate its shared existence. I see it akin to a social contract of sorts, agreements that an experience exists requires a consensus. We all tap into the pool of past, present and future knowledge when we create our own labyrinths. I would prefer to navigate that construct with other like minded creators I can share credit with as it takes shape into many forms. Having creative allies helps avoid the Minotaur. From a legal perspective we follow agreed upon rules, that said there are unwritten agreements to which we also adhere. De jure and de facto can be delineated but only in one aspect is it an unrecorded contract. The Ship of Theseus does not have to be a Corsair, plundering ideas for self edification leads the Minotaur of mind to devour itself, makes me reflect on some of the works of Hieronymus Bosch.

  4. Warner Losh says:

    in the open source project I work on we use "more than half" as our rule of thumb for a rewrite that adds a name to the copyright. and "nearly all" to remove a name. these are likely reasonably conservative guidelines. And it isn't lines of code strictly, but more of an implied "of the content" where different bits can have different weights as to what went into the "work" that was created. plus we tend to bias towards keeping attribution when in doubt to be polite and neighborly.

    it is a thorny problem to get an absolute answer. legally, it's when the "reduced form" of the work differs enough to not infringe. there are many tests in legal cases, but they all involve breaking the code down to basic blocks, ignoring differences in variable names, removing the things that can only be done one way, etc.

    so a color line, like the example you gave, has no copyright protection because red is red, you can't copyright facts (eg the numbers that make up a color) and there is one or a small number of ways in the language to say red.

  5. Mark Atwood says:

    This is my actual professional day job field of expertise.

    The answer is... It Depends.

    This is the reason why dayjob uses MIT-0 or 0BSD for everything we want or expect customers to do with the open source code we publish.

  6. says:

    in the open source project I work on we use "more than half" as our rule of thumb for a rewrite that adds a name to the copyright. and "nearly all" to remove a name. these are likely reasonably conservative guidelines.

    Those are not remotely conservative guidelines. And probably not even reasonable guidelines.

  7. Michael Brian Bentley says:

    Say exactly how you used it for its attribution to be understood. It is the basis of your project, it got you started, and that's a good thing. For you it was example code and not a finished component.

    1. KeyXote says:

      Akin to "Finding Forrester" in a way. My question is how to navigate Intellectual Property when/if GPT is utilized and produces a result with an unspecified amount of code requiring license attribution. Have limited experience with it so far, I suppose there could be embeddable ways to apply that but unsure.

  8. says:

    The BSD operating systems sort of did the Ship of Theseus with Unix: it started out as extensions to Unix but by the early 90s it only required a relatively small amount of code to be replaced in order to be completely free of the original copyright.

    Beyond that I think your “de minimis” argument is right: not everything is copyrightable, some things are too small/simple. By analogy: an author doesn’t get copyright over every single word they use in their writing. (Though credit for inspiration might feel appropriate in any case.)

  9. Martin L says:

    While I empathize with the feeling of "I rewrote or tossed out so much, the original copyright and license don't matter"... that's not the right criterion.

    Instinctively we know that some code isn’t really worth it copyright-wise. There is a criterion. But it’s got nothing to do with % of code. If you rewrote 1% or 99%, you still have to consider…

    https://en.wikipedia.org/wiki/Abstraction-Filtration-Comparison_test

    It’s a non-trivial process, so generally speaking if you’re using someone else’s code, just take/keep their copyright notice, and respect their license.

  10. Kra says:

    Public-facing example code, be it in a blog post, README or docs/ or even examples/ should be public domain, otherwise it feels like teasing the readers, for their egos, propagandas, laziness or whatever. A ship formed by lots of pieces among these "de minimis" parts from derivatives of derivatives looks like another ship of Theseus to me, which is more suitable for a custom license. I'd like to give credit for non-trivial example code. Reading a tutorial full of warnings about licensing sucks. When "the code is the spec" I would refrain from directly looking into the source code for stealing the ideas or algorithms, but well I think guys have "cloned/ported" MIT/BSD code into GPL code and vice versa, even today, and the unaware me is using their code and programs. If I don't want my mind tainted further, I should never crack open their source code. The % doesn't matter in my morality game, although I dislike this game about temptation of free open source code.

