Seriously though, this is an impressive result. "Beating" GPT-3.5 is a huge milestone, and I love that we're continuing the trend. I'll need to try out a quant of this to see how it does in real-world usage. Hope it gets added to the LMSYS arena!
I don't get your question. I think their contribution isn't training a model from scratch, but a new DPO loss function for fine-tuning. You can read about that in their paper, which is open-access. The model itself is a fine-tune of MoMo-72B-lora-1.8.7-DPO, which is based on Qwen-72B. Those models have their own papers and GitHub repos. If your question is about the dataset, that is answered in Appendix D of the paper.
(This is the repo they link with the statement "We release our code and pretrained models [...]". I can't find a ready-made Python script there (yet), but their method and contribution to DPO seem to be described in the paper. Everything looks pretty open to me. They even described their dataset. But it's a scientific paper with a small improvement to fine-tuning, accompanied by a model to show off the statistics... not a software release.)
It's awesome to have models like this open-sourced and competing with GPT-4, but the main reason people still prefer closed-source ChatGPT is its internet access. Is there any open model that has that now?