Cool project, though the space is very crowded: https://x.com/JeffDean/status/1991053401061536027 and http://semanticscholar.org/ come to mind.
Yeah, there were several attempts (including ar5iv); distill.pub is no longer active, and Semantic Scholar is PDF-based. None of them made full use of HTML or had a robust conversion system. Jeff Dean's post is awesome, though using Gemini 3 is compute-intensive and may still hallucinate in the end (I'm using a source-based LaTeX-to-JSON parser). And the output is still... not very interactive.
Just passing by to mention that if you get excited about seeing your upgrades in arXiv itself, we can talk about contributing them to the arXiv HTML pages.
But seeing your plans for Science Stack, all the best with the endeavour!
And I am curious to know if arXiv:2105.10386 works well.
It works! After the initial data load (it's a big paper), scrolling and performance are good.
You can view it at sciencestack.ai/arxiv/2105.10386
Note: there's no support for nomenclature/index yet. I'm also refactoring the data/JSON to a streaming model (right now it's one big JSON dump on load).
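For anyone curious what "streaming instead of one big JSON dump" can look like, here is a minimal sketch using newline-delimited JSON. This is only an illustration, not how Science Stack does it: the block fields (`type`, `title`, `text`, `label`) and the function names are made up for the example.

```python
import io
import json
from typing import Iterable, Iterator


def write_ndjson(blocks: Iterable[dict], out) -> None:
    """Write one JSON object per line instead of a single large array."""
    for block in blocks:
        out.write(json.dumps(block) + "\n")


def read_ndjson(stream) -> Iterator[dict]:
    """Yield blocks one at a time, so a consumer can start before the end."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical paper blocks; the field names are illustrative only.
paper = [
    {"type": "section", "title": "Introduction"},
    {"type": "paragraph", "text": "..."},
    {"type": "theorem", "label": "thm:main", "text": "..."},
]

buffer = io.StringIO()
write_ndjson(paper, buffer)
buffer.seek(0)

# Only the first line has been parsed at this point.
first_block = next(read_ndjson(buffer))
```

The idea is that the reader can render the first section as soon as its line arrives, rather than waiting for the whole document to download and parse.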
Why isn't there a dependency graph for the example paper "Sheaf Cohomology of Linear Predictive Coding Networks"?
Currently the dependency graph only works on \newtheorem environments (theorems, lemmas, definitions, etc.).
https://www.sciencestack.ai/docs/dep-graph
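To make the \newtheorem restriction concrete, here is a rough sketch of how such a graph could be built from LaTeX source. It is an assumption about the general approach, not the actual Science Stack implementation: the function name, the regexes, and the sample document are all hypothetical, and the sketch ignores references made inside proof environments.

```python
import re

NEWTHEOREM = re.compile(r"\\newtheorem\*?\{(\w+)\}")
LABEL = re.compile(r"\\label\{([^}]+)\}")
REF = re.compile(r"\\(?:ref|eqref|cref|Cref|autoref)\{([^}]+)\}")


def dependency_graph(source: str) -> dict[str, set[str]]:
    """Map each labelled theorem-like statement to the statements it references."""
    env_names = NEWTHEOREM.findall(source)
    if not env_names:
        return {}  # nothing declared via \newtheorem, so no graph

    env_pattern = re.compile(
        r"\\begin\{(%s)\}(.*?)\\end\{\1\}" % "|".join(map(re.escape, env_names)),
        re.DOTALL,
    )

    # Collect the body of every labelled environment, keyed by its first label.
    bodies = {}
    for match in env_pattern.finditer(source):
        labels = LABEL.findall(match.group(2))
        if labels:
            bodies[labels[0]] = match.group(2)

    # Add an edge for every reference that points at another known statement.
    graph = {label: set() for label in bodies}
    for label, body in bodies.items():
        for group in REF.findall(body):
            for target in (t.strip() for t in group.split(",")):
                if target in bodies and target != label:
                    graph[label].add(target)
    return graph


# Hypothetical input, for illustration only.
SAMPLE = r"""
\newtheorem{definition}{Definition}
\newtheorem{lemma}{Lemma}
\newtheorem{theorem}{Theorem}

\begin{definition}\label{def:a} ... \end{definition}
\begin{lemma}\label{lem:b} By Definition~\ref{def:a}, ... \end{lemma}
\begin{theorem}\label{thm:c} By Lemma~\ref{lem:b} and Definition~\ref{def:a}, ... \end{theorem}
"""

graph = dependency_graph(SAMPLE)
```

A paper that never declares theorem-like environments gives this kind of parser nothing to hang nodes on, which would explain an empty graph.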
I have a question: can it convert any PDF paper to this format, or do I need the LaTeX source of that paper?
It works with any LaTeX source.
https://www.sciencestack.ai/docs/faq