-
(11) Can you link or quote where the partnership violates the license? – rene (Mod), commented May 8, 2024 at 22:29
-
(8) @rene See stackoverflow.com/help/licensing. CC BY-SA requires attribution. LLMs and attribution are fundamentally incompatible, and they clearly have no idea what they are talking about when they pretend that they can give attribution. OpenAI would rather ask for forgiveness than permission, and SE cares little about licenses. – Gantendo, commented May 9, 2024 at 14:19
-
(10) @Gantendo the content is dual licensed. SE also has a license to our content, and that license doesn't require attribution. I'm not arguing about whether I agree with all these shenanigans or whether I trust one tech company over another; I'm just trying to get straight whether this violates anything. I don't believe it does, so this argument is not going to win it. We need to come up with a better one, potentially one that will survive in court. – rene (Mod), commented May 9, 2024 at 15:05
-
(5) @rene but OpenAI used data from SE before there was any partnership/agreement between SE and OpenAI. They operate on a "better to ask forgiveness than permission" model. They admitted to using the Common Crawl data, and stackoverflow is one of the domains in the dataset. commoncrawl.org/blog/… – Gantendo, commented May 9, 2024 at 15:59
-
(3) But OpenAI having used your content is a problem between you and OpenAI, not something SE can or needs to fix. What SE can do is use its license to your data to get reimbursed for OpenAI's use of the body of knowledge going forward, so that both SE and its communities get somewhat compensated: SE in money, the community by having more and better features on the public platform as a result. – rene (Mod), commented May 9, 2024 at 16:30
-
(6) @rene - Been over this every other time this comes up. Proper attribution is required in order for the license to be honored, which GenAI doesn't do. It is a clear violation. There are numerous lawsuits which are about to become legal landmark cases, and this legal basis will be used here against Stack Exchange should it come to that. – Travis J, commented May 9, 2024 at 18:02
-
Doesn't editing a post change the license on the latest revision to the most recent content license? I.e., is it not the last activity date, rather than the creation date, that matters? – starball, commented May 14, 2024 at 20:50
-
@super-starball-ultra - Only if new content was added, and then only the new content itself would fall under the current ToS contract. The revision date is what would be relevant to the content contract. For example, changing one character at the end of a long post would not make that post abide by the newest ToS contract, just as some employee changing everyone's last activity date would not change the revision dates. – Travis J, commented May 14, 2024 at 21:48
-
The consensus on generative AI is that the use of the training data is transformative and thus constitutes fair use in the US: arl.org/blog/…. Even Creative Commons itself considers it to be fair use: creativecommons.org/2023/02/17/fair-use-training-generative-ai. – Poscat, commented May 15, 2024 at 9:54
-
(1) @Poscat - The problem there is that for edge cases, which Stack Overflow has in abundance, AI models are not trained with the depth that would be desirable. Quite the opposite, and as a result, frequent verbatim reproduction occurs, especially when it comes to code. Training is rather benign so long as the model is never used to generate. However, when generation occurs and the output contains verbatim reproduction, the training does come into question with regard to sourcing. If the model was trained (sourced) on licensed material that is later plagiarized, then it will have harmed that author. – Travis J, commented May 15, 2024 at 19:28