I built a tool to detect almost any object in images using a prompt

Author: eyasu6464

by eyasu6464 · Mar 11, 2026 · 1 point · 0 comments

AI Analysis

● Mid · Slick · Ship It

Open-vocabulary object detection exporting to YOLO format without login requirements.

Strengths

• Exports detections as JSON or CSV for immediate YOLO model training.
• Processes images in memory without storing user data on servers.
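
The export described above could look roughly like the following sketch, which serializes detections to JSON and CSV entirely in memory. The field names (label, score, x_min, y_min, x_max, y_max) and both helper functions are hypothetical; the tool's actual export schema is not shown on this page.

```python
import csv
import io
import json

# Hypothetical column layout; the tool's real export schema is not documented here.
FIELDS = ["label", "score", "x_min", "y_min", "x_max", "y_max"]


def detections_to_json(detections):
    """Serialize a list of detection dicts to a JSON string."""
    return json.dumps(detections, indent=2)


def detections_to_csv(detections):
    """Serialize a list of detection dicts to CSV text without touching disk."""
    buffer = io.StringIO()  # in-memory text buffer, nothing is written to a file
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(detections)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        {"label": "dented car bumper", "score": 0.91,
         "x_min": 34, "y_min": 120, "x_max": 310, "y_max": 260},
    ]
    print(detections_to_json(sample))
    print(detections_to_csv(sample))
```

Writing to an in-memory buffer rather than a file is one simple way to hand results back to the user without persisting anything server-side.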

Weaknesses

• Wraps existing vision-language models without novel architectural improvements or differentiation.

Post Description

I built a tool that detects objects in images by describing them in text.

Instead of training a detector for a fixed set of classes, you can type what you want to find and the system returns bounding boxes for matching objects.

Examples of prompts: "dented car bumper", "person wearing red backpack", "cat scratching couch", "broken window".

The model is open-vocabulary, so it can generalize beyond predefined categories.

I originally built this while experimenting with ways to bootstrap datasets for training YOLO models without manually labeling thousands of images. The detections can be exported as labels and used to train a traditional detector.
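
To make the label export concrete, here is a minimal sketch of turning pixel-space boxes into YOLO label lines. It assumes the common YOLO text format (one line per object: class index, then box center x, center y, width, and height, each normalized by the image size). The Detection class, the helper names, and the prompt-to-class-index mapping are hypothetical and not taken from the tool itself.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected box in pixel coordinates (hypothetical schema)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def to_yolo_line(det, img_w, img_h, class_ids):
    """Format one detection as 'class x_center y_center width height'.

    All four box values are normalized to the 0-1 range by image size.
    """
    x_center = (det.x_min + det.x_max) / 2 / img_w
    y_center = (det.y_min + det.y_max) / 2 / img_h
    width = (det.x_max - det.x_min) / img_w
    height = (det.y_max - det.y_min) / img_h
    class_id = class_ids[det.label]  # assumed mapping from prompt text to class index
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def to_yolo_label_file(detections, img_w, img_h, class_ids):
    """Join all detections for one image into the text of its label file."""
    return "\n".join(to_yolo_line(d, img_w, img_h, class_ids) for d in detections)


if __name__ == "__main__":
    boxes = [Detection("broken window", 100, 50, 300, 250)]
    print(to_yolo_label_file(boxes, img_w=400, img_h=500,
                             class_ids={"broken window": 0}))
```

Each prompt becomes one class index, so a labeling run with several prompts would need a stable mapping shared across all images in the dataset.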

There is a demo where you can upload an image and try different prompts.

Curious where people think prompt-based detection is actually useful in real workflows (dataset labeling, QA inspection, etc.).

Similar Projects

AI/ML ● Mid

Satellite imagery object detection using text prompts

VLM-based satellite detection sounds good until you remember YOLO and specialized models handle occlusion better.

Ship It · Eye Candy

eyasu6464

53233mo ago

Design ● Mid

I built tool that automatically inserts text behind any object on image

AI object detection for text placement, but Canva and Adobe Express already do this better.

Eye Candy

Sayyidalijufri

113mo ago

AI/ML ●● Solid

Detect any object in satellite imagery using a text prompt

Tile-based VLM inference with coordinate projection, but dense objects still need YOLO.

Big Brain · Niche Gem

eyasu6464

2273mo ago

Other ○ Pass

Retina – An Object Detection Library in Python

Broken link to a Medium article means there is no project to review.

sumanpal7

101mo ago

AI/ML ● Mid

Trained YOLOX from scratch to avoid Ultralytics (aircraft detection)

The author documents ripping out Ultralytics and training YOLOX end-to-end on an aircraft dataset, releasing code under an MIT license so you can run and modify the whole pipeline yourself. This is the sort of no-frills, reproducible recipe that saves time if you need full control over configs, checkpoints and licensing — not novel research, but genuinely useful for people who hit the limits of packaged repos.

Niche Gem · Solve My Problem

auspiv

214mo ago

Developer Tools ● Mid

Turn server photos into editable rack templates (experimental)

CV-based port detection exists, but privacy warnings kill enterprise adoption.

Niche Gem · Ship It

matt-p

106h ago