Molmo 2, a new suite of state-of-the-art vision-language models with open weights, training data, and training code, can analyze videos and multiple images at once.
Ai2 is back with a massive upgrade. If you liked the original Molmo for images, you are going to love this. Molmo 2 brings that same "pointing" capability to video.
The coolest part is how it handles Space + Time. You don't just get a text summary; you get exact timestamps and coordinates. Ask it "how many times did the ball hit the ground?" and it points to every single instance.
It reportedly outperforms Gemini 3 Pro in video tracking🤯, all while being trained on less than 1/8th of the data Meta used for PerceptionLM. That is some serious efficiency.
What’s impressive here isn’t just the benchmark gains, but the form of the output. Grounding language in precise spatial and temporal coordinates is what turns video understanding from analysis into something actionable. Open weights plus this level of temporal and spatial fidelity feels like a real step toward usable perception systems, not just better demos.
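To make the "actionable" point concrete, here is a minimal sketch of how downstream code might consume timestamped points. Everything in it is an assumption for illustration: the `PointEvent` record, the `count_events` helper, the `min_gap_seconds` threshold, and the sample values are hypothetical and do not reflect Molmo 2's actual output format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PointEvent:
    """Hypothetical grounded answer: a moment in the video plus a pixel location."""
    t_seconds: float  # timestamp within the video
    x: float          # horizontal pixel coordinate
    y: float          # vertical pixel coordinate


def count_events(events, min_gap_seconds=0.2):
    """Count distinct events, merging points that follow the previous
    point by less than min_gap_seconds (e.g. one bounce seen in two frames)."""
    count = 0
    last_t = None
    for ev in sorted(events, key=lambda e: e.t_seconds):
        if last_t is None or ev.t_seconds - last_t >= min_gap_seconds:
            count += 1
        last_t = ev.t_seconds
    return count


# Made-up sample data: four points, two of which describe the same bounce.
bounces = [
    PointEvent(1.20, 312.0, 540.0),
    PointEvent(1.25, 314.0, 541.0),
    PointEvent(2.85, 402.0, 538.0),
    PointEvent(4.10, 455.0, 539.0),
]

print(count_events(bounces))
```

Because each answer carries a time and a location, a question like "how many times did the ball hit the ground?" becomes a simple count over structured records instead of a parse of free text.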
Congrats on the launch! Objaverse is an impressive contribution to the 3D and AI ecosystem. The scale and visual diversity of 800k+ annotated objects really stand out compared to existing repositories. This feels especially valuable for advancing embodied AI, robotics, and 3D generation research. Curious how you see the community contributing back or extending the dataset over time.
About Molmo 2 on Product Hunt
“SOTA video understanding, pointing, and tracking VLM”
Molmo 2 launched on Product Hunt on December 29th, 2025 and earned 102 upvotes and 5 comments, placing #9 on the daily leaderboard.
Molmo 2 was featured in Open Source (68.4k followers) and Artificial Intelligence (469k followers) on Product Hunt. Together, these topics include over 106.1k products, making this a competitive space to launch in.
Who hunted Molmo 2?
Molmo 2 was hunted by Zac Zuo. A “hunter” on Product Hunt is the community member who submits a product to the platform by uploading the images, adding the link, and tagging the makers behind it. Hunters typically write the first comment explaining why a product is worth attention, and their followers are notified the moment they post. Around 79% of featured launches on Product Hunt are self-hunted by their makers, but a well-known hunter still acts as a signal of quality to the rest of the community.