Edit - Meta Stack Exchange


11

Can you link or quote where the partnership violates the license?
– rene Mod
Commented May 8, 2024 at 22:29
8

@rene See stackoverflow.com/help/licensing CC BY-SA requires attribution. LLMs and attribution are fundamentally incompatible, and they clearly have no idea what they are talking about when they pretend that they can give attribution. OpenAI would rather ask for forgiveness than permission, and SE cares little about licenses.
– Gantendo
Commented May 9, 2024 at 14:19
10

@Gantendo the content is dual licensed. SE also has a license to our content and that license doesn't require attribution. I'm not arguing whether I agree with all these shenanigans and whether I trust one tech company over another, I'm just trying to get it straight whether this violates anything. I don't believe it does, so this argument is not going to win it. We need to come up with a better one, potentially one that will survive in court.
– rene Mod
Commented May 9, 2024 at 15:05
5

@rene but OpenAI used data from SE before there was any partnership/agreement between SE and OpenAI. They operate on a "better to ask forgiveness than permission" model. They admitted to using the Common Crawl data, and stackoverflow is one of the domains in the dataset. commoncrawl.org/blog/…
– Gantendo
Commented May 9, 2024 at 15:59
3

But that OpenAI used your content is a problem between you and OpenAI, not something SE can or needs to fix. What SE can do is use their license of your data to get reimbursed for use of the body of knowledge by OpenAI going forward so both SE and its communities get somewhat compensated: SE in money, the community by having more and better features on the public platform as a result of that.
– rene Mod
Commented May 9, 2024 at 16:30
6

@rene - Been over this every other time this comes up. Proper attribution is required in order for the license to be honored, which GenAI doesn't do. It is a clear violation. There are numerous lawsuits which are about to become legal landmark cases, and this legal basis will be used here against Stack Exchange should it come to that.
– Travis J
Commented May 9, 2024 at 18:02
Doesn't a post getting edited change the license on the latest rev to the most recent content license? I.e. is it not the last activity date instead of the creation date that matters?
– starball
Commented May 14, 2024 at 20:50
@super-starball-ultra - Only if new content was added would that new content itself fall under the current ToS contract. The revision date would be relevant to the content contract. For example, changing one character at the end of a long post would not then make that post abide by the newest ToS contract; just as some employee changing everyone's last activity date would not change the revision dates.
– Travis J
Commented May 14, 2024 at 21:48
The consensus on generative AI is that the use of the training data is transformative, and thus constitutes fair use in the US: arl.org/blog/…. Even Creative Commons itself considers it to be fair use: creativecommons.org/2023/02/17/fair-use-training-generative-ai.
– Poscat
Commented May 15, 2024 at 9:54
1

@Poscat - Problem there is that in edge cases, which are abundant on Stack Overflow, AI models are not trained with the depth that would be desirable. Quite the opposite, and as a result, frequent verbatim reproduction occurs, especially when it comes to code. Training is rather benign so long as it is never used to generate. However, when generation occurs and that generation contains verbatim reproduction, then training does come into question with regard to sourcing. If it was trained (sourced) on material which was licensed and is later plagiarized, then it will have harmed that author.
– Travis J
Commented May 15, 2024 at 19:28
