Child abuse images removed from AI image-generator training source, researchers say

girlfreddy@lemmy.ca · 4 months ago

Flying Squid@lemmy.world · 4 months ago

I’m glad they removed them, but it’s kind of closing the barn doors after the horses have bolted at this point.

Iapar@feddit.org · 4 months ago

Complete failure of everyone involved that it was in there in the first place.

istanbullu@lemmy.ml · 4 months ago

These datasets have billions of images in them (the LAION database has 5 billion images!). There is no way a human can go through them to check for bad content.

Iapar@feddit.org · 4 months ago

Then just don’t use it? Or use a program? There are multiple ways to not do something stupid, and none of them occurred to them because it is more important to them to be at the top of the shitpile.

istanbullu@lemmy.ml · 4 months ago

The dataset sizes needed for machine learning rule out any kind of human verification. It’s just not possible to manually check billions of images.

Iapar@feddit.org · 4 months ago

Oh, that makes it okay then.

istanbullu@lemmy.ml · 4 months ago

How would you check 5 billion images?

Iapar@feddit.org · 4 months ago

Mu.

I wouldn’t use an amount of images I couldn’t check. I wouldn’t use images from unchecked sources. I wouldn’t make money from sexually exploited children.

And I think people that don’t see the most obvious solution to that are fucked in the head.

istanbullu@lemmy.ml · 4 months ago

That won’t work. Models of this kind need billions of images or they are trash.

vrek@programming.dev · 4 months ago

Great, they removed them… Did they report the images to the authorities?

RecallMadness@lemmy.nz · 4 months ago

If 2000 out of 5,000,000,000 images can be found, why couldn’t they be found before the dataset was published?

girlfreddy@lemmy.ca · 4 months ago

That’s a question to be pondered for the ages.

/s