Hacker News
lllllm 9 months ago | on: ETH Zurich and EPFL to release a LLM developed on ...
Yes, this is an interesting question. In our arXiv paper [1] we studied this for news articles, and also removed duplicate articles (decontamination). In the case of news data, we did not observe an impact on the downstream accuracy of the LLM.
[1] https://arxiv.org/abs/2504.06219