[Vp-reproduce-subgroup] "Models are not consistently licensed"

Jacob Barhak jacob.barhak at gmail.com
Thu Dec 30 12:33:00 PST 2021


Hi Sheriff,

This is the thread where your text should have been placed. I am copying it
here for you to maintain continuity.

This is the text Sheriff wrote in a different thread:

"

Indeed often models are hard to locate. Hence in our reproducibility study
( Tiwari et al. 2021, https://doi.org/10.15252/msb.20209982), we strongly
recommend authors to make their models public, but more specifically
through public repositories such as BioModels (
https://www.ebi.ac.uk/biomodels). We recommend authors to submit model
codes, parameter sets, and simulation conditions needed to reproduce
simulation studies.

BioModels provides a model curation and annotation service, in which the
models' reproducibility is assessed and the models are annotated with
controlled vocabularies. BioModels also offers sophisticated search engines
to search and locate models and model components
(https://www.ebi.ac.uk/biomodels/parameterSearch).
Overall this will greatly facilitate model reuse.

More importantly, all models are disseminated under the permissive CC0
license in BioModels. This is because it is crucial to openly release the
models developed through public funding. Also, curation services done at
BioModels are supported by public funding.

Although a few users suggested we use CC-BY, we decided to continue with CC0.
However, we strongly recommend that our users properly cite the model when
it is reused, and most researchers normally do. Also, if a modeler collects and
integrates components from several hundred models, it may become difficult
to properly acknowledge them all.


"

If anyone has any other contributions to this thread about licensing,
please reply to this message so we can collect those ideas and organize
them in the response to reviewers.

Hopefully people will have time for this during the holiday.


         Jacob





On Tue, May 25, 2021, 07:39 Jacob Barhak <jacob.barhak at gmail.com> wrote:

> Ok William,
>
> You really want to go into details, so let us do so. I will try to
> be brief, because it is an endless topic and I can really go on for a long
> time. Although brief is relative.
>
> You write that:
> "The claim that we can’t make software out of pieces with different
> licenses is demonstrably false."
> This is not entirely false: you can indeed combine different pieces under
> some conditions, yet your code can become so messy and problematic to
> transport that many times you may be better off not reusing some piece of
> code. And in any case, you have multiple restrictions and many times cannot
> distribute the code together.
>
> And abandoned code is really a problem. This is why public domain licenses
> started appearing in the last decade and copyleft licenses are not used as
> much as they used to be. Open source was a pretty good idea at a time when a
> lot of code was proprietary, yet the problems of copyleft and of code that
> needed relicensing started appearing, and there was a need for a new
> solution. It took about a generation for public domain licenses to appear -
> and we are just starting to experience it.
>
> You write:
> "The claim that we can’t use software abandoned by the original authors is
> also false."
> The issue is that you are confined to a specific license that may not be
> compatible with what you want to achieve and if you integrate a library
> into your code and build upon it for a long time, this becomes a
> technological debt you have to carry. The problem becomes much harsher when
> you try to do something that interfaces with the commercial world that
> imposes requirements on licenses - if you have abandoned code you
> integrated - you may be stuck since you cannot change the license. This is
> also true if code was not abandoned and just has many collaborators - you
> need to trace them all and ask them all to agree to a different license -
> the more contributors you have, the more problematic it is - this
> eventually can make some code practically unusable in some circumstances.
> And in situations like this COVID pandemic where you need to act fast to
> achieve results and some licensing problem appears - believe me, it is not
> an easy situation when there is time pressure. If you use a public domain
> license - all this disappears and you can innovate quickly. Also a lot of
> the bureaucracy disappears - making life so much easier.
>
> As for patents, those exist and will be used by commercial and scientific
> entities alike - Many university faculty members hold patents - and
> Universities sometimes have departments that support and encourage the
> creation of patents - so those will exist as long as law supports it. In
> fact, if you look at NIH policy, you will find out that it allows patents
> and assigns intellectual property rights arising from grants to the
> awardees. There were several attempts at making research products free and
> accessible by the public in the US, yet those did not catch on so far. I
> can write about those in a different email - yet I am trying to stay on
> point here. So like it or not, people will restrict what you can do. And
> even an open source license does not protect you from an orthogonal legal
> restriction. However, licenses like CC0 at least inform you about it and
> remove at least one restriction - which is more than many other licenses
> do.
>
> And as for attribution - nothing says you cannot attribute the work when
> using CC0 - in fact CC0 mentions the entity releasing the code to the
> public - it is just that you are not demanded to do so. You can always give
> credit - you are just not required - so scientific practice is not
> disrupted by "public domain" licenses - it is just made easier. Also, you
> can release the same work under multiple licenses - one that demands
> attribution and one that waives copyright restrictions and let the user
> choose which one they wish to propagate. So if you think about it, once you
> can release it to the public domain, any other license is just a
> restriction on the party trying to reuse your work regardless of how
> liberal you think the license is.
>
> And you mention BSD/MIT licenses - remember, those are still forms of
> protection of intellectual property - copyright based. Regardless of how
> liberal you describe them to be, you are still tied to the original
> contributors if you need anything changed in the license due to some
> incompatibility which leads us back to the original issue of license
> compatibility.
>
> Also, when a license is copyright based it depends on who the registered
> owner is and as you may have seen in our discussion on this mailing list,
> different institutions have different policies on ownership. So it becomes
> messy again - there is really no uniformity - it is all situational and
> based on interests of the owners.
>
> And you mention OSI - it is only one organization that catalogues open
> source licenses. There are also Creative Commons, the Free Software
> Foundation, and the Open Knowledge Foundation. They have different
> perspectives, and I must add that OSI is behind in adopting the new
> generation of public domain licenses, so perhaps it is better to choose
> another entity like Creative Commons for licenses. In fact, we both know of
> one COVID modeling platform that is now released under a Creative Commons
> license rather than the traditional licenses.
>
> I agree with you that releasing a model/code without a license is
> problematic - it is actually a strong copyright restriction that equals
> "all rights reserved". So this is strongly discouraged unless you really
> want to restrict.
>
> The reason this discussion is taking place is because we have a section
> about it in the paper and we do mention public domain licenses.
>
> In fact BioModels, the repository where many biological models are stored,
> made the correct choice of license and stores models under CC0 - this means
> that those models can be reused much more easily.
>
> I don't know what led to this decision by BioModels - perhaps Sheriff can
> tell us the story, yet I believe their decision was smart and correct.
> There are currently over 1,000 curated models in that repository and
> hopefully this number will grow quickly so we will have a large public
> repository that allows model reuse with an easy to use license interface.
>
> Think about it long term, if you really want modeling technologies to be
> widely adopted, you need to make them very accessible and if you want to
> integrate them, you want to remove as much bureaucracy as possible. Think
> about a future where hundreds of those models will have to be automatically
> merged together in ensembles by machine in attempts to explain observed
> biological phenomena. We are still far from that point, yet if we resolve
> the problems we listed in the paper we wrote together we will be closer to
> such a future solution. And fortunately BioModels resolved our need to
> worry about license compatibility issues.
>
> I thank you for taking the time to look at my video and the discussion,
> and I hope that this response explains well the need to remove licensing
> restrictions from integrating models.
>
>             Jacob
>
>
>
>
> On Tue, May 25, 2021 at 5:18 AM William Waites <wwaites at ieee.org> wrote:
>
>> Dear Jacob,
>>
>> I did watch your video and understand what you are saying. I’m also
>> pretty well-informed about licenses and patents as they relate to software
>> and data having been engaged with that topic in different countries (i.e.
>> different legal contexts) since the mid-1990s.
>>
>> There are several problems with your analysis.
>>
>> 1. It is perfectly well possible to compose together software with
>> different licenses. We do this all the time, and very successfully. We
>> would not have Linux distributions if this were not possible, and most of
>> the large programs written in Python or Java or whatever with a ton of
>> libraries that we use for scientific computing would not exist. Different
>> communities have different cultural ideas about which kinds of licenses
>> they prefer. Broadly, there are BSD/MIT style licenses that some like that
>> basically only require attribution, and there are copyleft GPL style
>> licenses that others like that additionally require derived work to also be
>> free. This is, to a very large extent, a solved problem. As I say, most of
>> modern computing would not be possible if we hadn’t already solved this.
>>
>> 2. Abandoned code is not a problem if it is properly licensed in the
>> first place. You are perfectly free to take any GPL or MIT or BSD licensed
>> software that has been abandoned and continue to use it and develop it.
>> Nothing stops you. Nothing at all. You are not free to change its license
>> without the involvement of the original authors, but why should you want to?
>>
>> The claim that we can’t make software out of pieces with different
>> licenses is demonstrably false.
>>
>> The claim that we can’t use software abandoned by the original authors is
>> also false.
>>
>> It is perfectly fine to use CC0. As I said, in the USA that is equivalent
>> to putting the software in the public domain. Not every country has the
>> concept of public domain in the same sense, so CC0 is designed to emulate
>> it in those cases. This is unusual, most people do not do this because they
>> require attribution at the very least. Attribution is the norm in
>> scientific work so it seems like public domain/CC0 is not really the best
>> match to established practice.
>>
>> I understand very well what you are doing with patents and you have been
>> nothing but up front about it. I understand very well what patents are and
>> how they work. I still think it’s a bad idea to propose using patents for
>> scientific models. It’s also a pretty fringe idea. I often like fringe
>> ideas but I don’t like this one.
>>
>> It is possible to get into trouble if you try to use code released under
>> a GPL-style copyleft license with something proprietary. This is by design,
>> it is not by accident or ignorance. If we want to discourage this (I don’t,
>> personally) then we can recommend the more liberal MIT/BSD style of license.
>>
>> It is a very big problem when people release code with no license at all.
>> That means we can’t do anything with it at all. I suggest that we drop the
>> discussion about patents and simply say that it is important that model
>> code is released under some license. The OSI maintains a decent list of
>> appropriate licenses: https://opensource.org/licenses
>>
>> Best wishes,
>> -w
>>
>> > On 24 May 2021, at 19:06, Jacob Barhak <jacob.barhak at gmail.com> wrote:
>> >
>> > Thanks William,
>> >
>> > A good debate is reasonable regarding licensing. So it is welcome.
>> >
>> > I can write a lot about it and in fact I have been having this
>> conversation on several channels.
>> >
>> > There are many forms of restrictions on what you can do. Even open
>> source licenses are, despite their name, based on copyright law, which is
>> a form of legal restriction. Both copyright and patents are forms of legal
>> restrictions. And if you want a comparison and a longer discussion, I
>> suggest you look at the table in the presentation I made for COMBINE last year:
>> >       • Jacob Barhak, Open Source and Sustainability, COMBINE 2020
>> October 5-9. Video:
>> https://drive.google.com/drive/folders/1actGnx6FwvoCcPrrF3qbnO0AmHt10WN6
>> starting from minute 13:10. Presentation:
>> https://jacob-barhak.github.io/COMBINE2020_OpenSource_upload_2020_10_04.odp
>> >
>> > Many people are unnecessarily worried about patents. I assume many
>> times without understanding the details. I repeat again my conflict of
>> interest, since I do hold patents. So I may be biased in your mind, yet
>> please do check out my arguments in the presentation.
>> >
>> > Note that just like software licenses are not always compatible with
>> each other, patents are not always compatible with some licenses and with
>> intentions of all parties involved - this is many times the source for
>> misunderstanding. Many restrictions are orthogonal to each other and need
>> to be cleared before use. In many cases, some work may need multiple
>> licenses and permissions so you can use it. It depends on many factors,
>> including jurisdictions, time, etc.
>> >
>> > Specifically for CC0 - CC0 is the least restrictive license I am aware
>> of since it waives copyright and is therefore highly compatible with many
>> others - this is why it was mentioned as a good solution and indeed it has
>> been widely adopted. Moreover, it resolves issues of abandoned software or
>> of software whose multiple contributors cannot agree on a license. So it
>> gives life to code and provides incentives to improve progress.
>> >
>> > If I am about to integrate a new model or a new work, I may be
>> restricted by many restrictions, and those are coming from potentially
>> multiple sources, especially if I am integrating multiple models. So
>> eliminating copyright and making things compatible helps a lot. It may not
>> be sufficient since there are still orthogonal restrictions, yet it's a
>> good start. This is why it was recommended and indeed more and more
>> entities are using CC0 to release work or to accumulate it in a repository.
>> >
>> > You mentioned the CC license family - yes, those are nice licenses, yet
>> some still hold restrictions and are not even compatible with each other.
>> Here is the compatibility chart within the CC license family:
>> > https://wiki.creativecommons.org/wiki/Wiki/cc_license_compatibility
>> >
>> > And yes, in some cases for some entities some licenses will not match
>> their intentions - it depends on the situation - yet if you have to bridge
>> many intentions, it's a good idea to remove as many restrictions as
>> possible.
>> >
>> > Hopefully you find these explanations sufficient for now.
>> >
>> >               Jacob
>> >
>> >
>> >
>> > On Mon, May 24, 2021 at 8:19 AM William Waites <wwaites at ieee.org>
>> wrote:
>> > I am hesitant to get involved in this particular aspect of the paper
>> and have long since timed out on software licensing discussions. However…
>> >
>> > The point that there are inconsistent licenses (or even absent licenses
>> which is legally the most restricted since that defaults to “all rights
>> reserved” essentially) and this can cause problems when assembling
>> composite models is accurate and fair. This is a challenge that we need to
>> address. We want to maximise the impact of the public funding of much of
>> the kind of work that we do, which means that others need to be as free as
>> possible to reuse our work.
>> >
>> > It is debatable whether CC0 is appropriate. It is meant to emulate the
>> public domain in places that do not have a legal concept of public domain.
>> It does not require attribution, which is the normal standard for academic
>> work. The other CC licenses that require attribution are not designed for
>> software. Insisting on using the public domain for software and then
>> asserting the ability to control use using patents is a novel idea, but I
>> don’t think it is a very good one. It is also not possible in many
>> jurisdictions that do not allow software patents.
>> >
>> > Standards bodies also typically have patent policies which range from
>> “disclose your patents” to “if you contribute patented stuff you must agree
>> to never try to enforce it”. We can reasonably expect that if we produce
>> patent-encumbered standards, nobody will use them. From a standards
>> development point of view, this needs addressed as well.
>> >
>> > There is also a ton of well-developed literature on free and open
>> source software licensing and compatibility among licenses.
>> >
>> > Best wishes,
>> > -w
>> >
>> >
>>
>>