Kimi K1.5: Scaling Reinforcement Learning with LLMs

200 points | by noch 6 days ago

34 comments

NitpickLawyer 6 days ago
Really unfortunate timing with Deepseek-R1 and the distills coming out at basically the same time. Hard for people to pay attention to this, plus open source > API, even if the results are a bit lower.
- miohtama 5 days ago
  DeepSeek is not open source, as the source of its 14.8T high quality training tokens is not disclosed.
  - arjun_krishna1 5 days ago
    At some point this becomes No True Scotsman. They have disclosed a lot.
  - NitpickLawyer 5 days ago
    Yawn. This has been debated ad nauseam. If you want to feel that way it's up to you, but disclosing means and hidden information has never been a requirement for open source. As long as the thing is licensed under a permissive license (MIT in this case), and you can see the data, change the data and re-publish the data, it's open source.
zurfer 6 days ago
Is it fair to say that 2 of the 3 leading models are from Chinese labs? It's really incredible how fast China has caught up.
- idiotsecant 6 days ago
  It's not all that surprising that the country with 20% of the population of earth has some smart people in it. What is, I think, surprising and fascinating is how China has been focusing on doing more with less - their underdog position w.r.t. hardware has pushed a huge focus on model efficiency and distillation, to the benefit of us all.
  I think it's a distinct possibility that while the first AGI to say 'hello world' might do it in English, the first open source AGI running on consumer hardware will probably say it in Mandarin.
  - whimsicalism 6 days ago
    > It's not all that surprising that the country with 20% of the population of earth has some smart people in it.
    where’s india's reasoning models? what about the entire continent of africa? i’d be curious if they even have a single h100 on the continent
    - m00x 4 days ago
      India's smartest people moved to the US
    - airstrike 5 days ago
      I guess central planning can be very effective when playing catch up on greenfield projects of massive scale
      - eunos 5 days ago
        Well, DeepSeek hadn't really caught the eye of the leadership before they released R1.
      - MacsHeadroom 5 days ago
        DeepSeek is a scrappy tech startup fully funded by a successful hedge fund owner without any cooperation with the government.
        Meanwhile OpenAI's board includes the most recent head of the NSA and Altman just announced he'll be in DC pandering to Trump next week.
asah 6 days ago
The set of math/logic problems behind AIME 2024 appears to be... https://artofproblemsolving.com/wiki/index.php/2024_AIME_I_P...
Impressive stuff! But unclear to me if it's literally just these 15 or if there's a large problem set...
- codelion 5 days ago
  The full dataset is here - https://huggingface.co/datasets/AI-MO/aimo-validation-aime
  You can use the eval script I have in optillm to benchmark on it - https://github.com/codelion/optillm/blob/main/scripts/eval_a...
- whimsicalism 6 days ago
  doesn’t seem too hard to me, shame i was never exposed to this stuff in high school
  e: oh i see, they get progressively harder
joaohkfaria 6 days ago
But wait, which LLMs were used to train Kimi? It wasn't clear in the report.
- m00x 4 days ago
  wdym? They did their own pretraining.
cuuupid 6 days ago
I really, really dislike when companies use GitHub to promote their product by posting a "research paper" and a code sample.
It's not even an SDK, library, etc., it's just advertising.
I've noticed a number of China-based labs do this; they will often post a really cool demo, some images, and then either an API or just nothing except advertising for their company (e.g. model may not even exist). Often they will also promise in some GitHub issue that they will release the weights, and never do.
I'd love to see some sort of study here, I wonder what % of "omg really cool AI model!!!" hype papers [1] never provide an API, [2] cannot be reproduced at all, and/or [3] promise but never provide weights. If this was any other field, academics would be up in arms about likely fraud, false advertising, etc.
- diggan 6 days ago
  It's not just Chinese labs that do this, lots of companies upload a README to a GitHub repository then link that repository from the website, I guess so they can have a GitHub icon somewhere on the website?
  Submission is basically a form for requesting access to their closed API (which ironically is called "OpenPlatform" for some reason).
  - rfoo 6 days ago
    > which ironically is called "OpenPlatform" for some reason
    This is pretty weird, the original text is 开放平台, but it basically is another name for "API" in China.
    Not sure who started this, but it's really popular, for example, WeChat has an "Open Platform": https://open.weixin.qq.com/. AliPay too: https://open.alipay.com/. And peak strangeness, Alibaba Cloud (whose API is largely an AWS clone): https://open.aliyun.com/
    - diggan 6 days ago
      Same thing in English, you have huge enterprises which basically operate on the complete opposite end of the spectrum, and end up calling themselves things like "OpenAI".
      It even bleeds into marketing pages, go to the Llama website and you see "open source model" plastered all over the place, completely misusing both the "open" and "source" parts of it.
    - v3ss0n 6 days ago
      How about OpenAI?
      - llm_trw 6 days ago
        You mean the charity* foundation** Open***AI?
- sgtpepper13 6 days ago
  I'm one of the authors of the paper. Thanks for raising a good point. It would be better if we uploaded the paper to arXiv, but it's MLK Day in the US, so submissions will be delayed by a couple of days. And we just can't wait to share some of the knowledge we gained from our experiments. Hope they will be useful for the community. Would much appreciate it if you have an idea about a better site for this. That said, our API requests are open and we'll roll out more in the next few days depending on our server resources.
- prjkt 6 days ago
  These types of "repositories" should carry some kind of flag/indication that they contain no source code, similar to when a repo is archived
  - whimsicalism 6 days ago
    really? it takes like 1 second looking at the file structure to see what it is, maybe like 2 seconds if you’re hopeful “images” somehow refers to a dockerfile or something
- whimsicalism 6 days ago
  …but they do provide an API.
  HN is really not beating the bikeshedding allegations
  - cuuupid 5 days ago
    They don't; it's just promised in the future™. And even then, it should be a webpage on their website or API documentation, not a GitHub repo.
    It's not bikeshedding to expect a source code repository to have source code...
  - ensignavenger 6 days ago
    Do they? I see a note that says it will be "available soon"?
- visarga 6 days ago
  That is unfortunate but they do present some theoretical insights about scaling context length and probably a more efficient way to do RL. Even knowledge about it can have an effect on next iterations from other labs.