itishappy 5 months ago

I'm no lawyer, but this repository sure appears to be relicensing the Harry Potter series under the GPL.

gunalx 5 months ago

If all the training data is in the txt files, it is obviously trained on copyrighted material, and on an immensely small amount of text. I'm impressed if the outputs even start to make sense at all.

nickpsecurity 5 months ago

Are you the author of the GitHub repo? If so, I might have a few suggestions.

burgerrito 5 months ago

....is that the whole Harry Potter book in one .txt file, hosted on GitHub!?

  • ClearAndPresent 5 months ago

    That is all the Harry Potter books in one .txt file, hosted on GitHub.

lostmsu 5 months ago

Large power? 20MW?

cjtrowbridge 5 months ago

Bro, use FineWeb. Random books are not objectively good training data.

parpfish 5 months ago

“Small LLM” means “Small Large Language Model”