SOTA video understanding, pointing, and tracking VLM
Molmo 2, a new suite of state-of-the-art vision-language models with open weights, training data, and training code, can analyze videos and multiple images at once.
Ai2 is back with a massive upgrade. If you liked the original Molmo for images, you are going to love this. Molmo 2 brings that same "pointing" capability to video.
The coolest part is how it handles Space + Time. You don't just get a text summary; you get exact timestamps and coordinates. Ask it "how many times did the ball hit the ground?" and it points to every single instance.
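To make the "timestamps and coordinates" idea concrete, here is a minimal sketch of how an answer like that could be handled once it has been parsed. Everything in it is an assumption for illustration: the `VideoPoint` class, its field names and units, and the sample values are hypothetical and are not Molmo 2's actual output format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoPoint:
    """One pointed-at event (hypothetical structure, not Molmo 2's real format)."""
    time_s: float  # timestamp in seconds (assumed unit)
    x: float       # horizontal position (assumed pixel coordinates)
    y: float       # vertical position (assumed pixel coordinates)


def count_events(points):
    """If the model emits one point per event, counting is just the number of points."""
    return len(points)


def format_points(points):
    """Return human-readable strings for the points, ordered by timestamp."""
    ordered = sorted(points, key=lambda p: p.time_s)
    return [f"{p.time_s:.1f}s at ({p.x:.0f}, {p.y:.0f})" for p in ordered]


# Made-up sample data standing in for "every time the ball hit the ground".
bounces = [
    VideoPoint(4.8, 300.0, 410.0),
    VideoPoint(1.2, 312.0, 405.0),
    VideoPoint(3.1, 298.0, 402.0),
]

print(count_events(bounces))
for line in format_points(bounces):
    print(line)
```

The point of the sketch is only that a pointing-style answer carries more than a number: each counted event stays tied to a moment and a location that can be checked against the video.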
It reportedly outperforms Gemini 3 Pro in video tracking🤯, all while being trained on less than 1/8th of the data Meta used for PerceptionLM. That is some serious efficiency.
About Molmo 2 on Product Hunt
Molmo 2 launched on Product Hunt on December 29th, 2025 and earned 102 upvotes and 5 comments, placing #9 on the daily leaderboard.
Molmo 2 is listed under the Open Source and Artificial Intelligence topics, which together have 537.4k followers on Product Hunt.
Who hunted Molmo 2?
Molmo 2 was hunted by Zac Zuo. A “hunter” on Product Hunt is the community member who submits a product to the platform: uploading the images, adding the link, and tagging the makers behind it. Hunters typically write the first comment explaining why a product is worth attention, and their followers are notified the moment they post. Around 79% of featured launches on Product Hunt are self-hunted by their makers, but a well-known hunter still acts as a signal of quality to the rest of the community.