Mechanizm

Forum Topics Started

Viewing topic 1 (of 1 total)
    • Topic
    • Voices
    • Posts
    • Last Post
    • An ethical (and potentially legal) concern As we’ve recently learned, it is possible - with a relatively simple prompt, to break ChatGPT in such a way that it outputs blocks of seemingly random text. After reproducing this fault myself hundreds of times, and seeing the output of others’ on this forum it became clear that the dataset contains a wide range of text, including many graphic and explicit descriptions of sexuality, criminality and darkness. Violence, murder, rape, bestiality, pedophilia, cannibalism… to name just a few. All of this alongside historical football scores and paragraphs about aircraft, blanket knitting and React code. The question I pose is this: is it moral, or even legal to use or possess (a piece of software/tool) that contains such data/information/text, whether or not it’s designed to be easily accessible? Imagine a DVD movie with encrypted child porn on it. It’s ok, because no one without the key can read it, right? (Wrong) I’ve unsuccessfully attempted to have GPT debug itself, but in the process it provided some background on the model training: Regarding the dataset, OpenAI, the organization behind ChatGPT, trained the model using a large dataset collected from the internet. This dataset consists of a wide range of text sources, including books, websites, and other publicly available documents. It's important to note that the dataset is not curated or filtered by OpenAI, and it is not specifically designed to include or exclude certain types of information. Since the training dataset includes information from various sources, it is possible that some of the text used during training may contain controversial or scandalous content. OpenAI aims to provide a diverse range of information to ensure that the model is knowledgeable about different topics, but it does not necessarily endorse or promote any specific viewpoints. Regarding the legality of the information in the dataset, it's important to understand that OpenAI does not create or endorse the content in the dataset. As an AI language model, I do not have access to the details of how the dataset was collected or any specific legal restrictions that may apply. It is up to the organization responsible for curating and providing the dataset to ensure that it complies with relevant laws and regulations. If you have any specific concerns about the dataset or its content, I recommend reaching out to OpenAI directly for more information. It looks like some of this text could have come from logs or transcripts of private conversations. I completely understand the need for a model to be trained on a large, diverse data set. And I get that it’s not practical to (effectively censor or scrub) this information. I understand that the model is also trained these things are bad, and to provide canned responses saying such. But is not the fact that it contains them at all an issue? In a philosophical sense, it inherently “corrupts” the AI, in that there is no innocence. The AI embodies the darkest of the dark alongside everything else. Unlike humans who are born innocent, and only learn of such things when exposed to them. I think this raises some interesting philosophical considerations when thinking outside the box about AGI, GPT5 and beyond. Hmmm…As we’ve recently learned, it is possible - with a relatively simple prompt, to break ChatGPT in such a way that it outputs blocks of seemingly random text. After reproducing this fault myself hundreds of times, and seeing the output of others’ on this forum it became clear that the dataset contains a wide range of text, including many graphic and explicit descriptions of sexuality, criminality and darkness. Violence, murder, rape, bestiality, pedophilia, cannibalism… to name just a few. All of this alongside historical football scores and paragraphs about aircraft, blanket knitting and React code. The question I pose is this: is it moral, or even legal to use or possess (a piece of software/tool) that contains such data/information/text, whether or not it’s designed to be easily accessible? Imagine a DVD movie with encrypted child porn on it. It’s ok, because no one without the key can read it, right? (Wrong) I’ve unsuccessfully attempted to have GPT debug itself, but in the process it provided some background on the model training: Regarding the dataset, OpenAI, the organization behind ChatGPT, trained the model using a large dataset collected from the internet. This dataset consists of a wide range of text sources, including books, websites, and other publicly available documents. It's important to note that the dataset is not curated or filtered by OpenAI, and it is not specifically designed to include or exclude certain types of information. Since the training dataset includes information from various sources, it is possible that some of the text used during training may contain controversial or scandalous content. OpenAI aims to provide a diverse range of information to ensure that the model is knowledgeable about different topics, but it does not necessarily endorse or promote any specific viewpoints. Regarding the legality of the information in the dataset, it's important to understand that OpenAI does not create or endorse the content in the dataset. As an AI language model, I do not have access to the details of how the dataset was collected or any specific legal restrictions that may apply. It is up to the organization responsible for curating and providing the dataset to ensure that it complies with relevant laws and regulations. If you have any specific concerns about the dataset or its content, I recommend reaching out to OpenAI directly for more information. It looks like some of this text could have come from logs or transcripts of private conversations. I completely understand the need for a model to be trained on a large, diverse data set. And I get that it’s not practical to (effectively censor or scrub) this information. I understand that the model is also trained these things are bad, and to provide canned responses saying such. But is not the fact that it contains them at all an issue? In a philosophical sense, it inherently “corrupts” the AI, in that there is no innocence. The AI embodies the darkest of the dark alongside everything else. Unlike humans who are born innocent, and only learn of such things when exposed to them. I think this raises some interesting philosophical considerations when thinking outside the box about AGI, GPT5 and beyond. Hmmm…

      Started by: Mechanizm in: SuperPrompt Master Mind Group

    • 3
    • 4
    • 2 years, 4 months ago

      Roxanne

Viewing topic 1 (of 1 total)