Harvard and Google to release 1 million public-domain books as AI training dataset

AI training data comes with a big price tag, one best suited to deep-pocketed tech firms. This is why Harvard University plans to release a dataset of roughly 1 million public-domain books, spanning genres, languages, and authors including Dickens, Dante, and Shakespeare, all of which are no longer copyright-protected due to their age. The new […]

Dec 12, 2024 - 22:00

© 2024 TechCrunch. All rights reserved. For personal use only.
