Data Studio – Open-Source Data Notebooks
DuckDB + Pyodide notebooks in-browser, but Jupyter+local and Quarto already do this.

Local AI analyst generates Python notebooks instead of just chat responses, keeping data on your machine.
Data scientists, analysts, researchers
Cursor · Jupyter AI · DataCamp Workspace
I’ve been working on mljar-supervised (open-source AutoML for tabular data) for a few years. Recently I built a desktop app around it called MLJAR Studio.
The idea is simple: you talk to your data in natural language, the AI generates Python code, executes it locally, and the whole conversation becomes a reproducible notebook (*.ipynb file). So instead of just chatting with data, you end up with something you can inspect, modify, and rerun.
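To make the "conversation becomes a notebook" idea concrete, here is a minimal sketch of how prompt/code pairs could be written out as an .ipynb file. This is not MLJAR Studio's actual implementation: the helper names `conversation_to_notebook` and `save_notebook` and the (prompt, code) tuple shape are assumptions made up for illustration. The only thing relied on is the standard nbformat 4 JSON layout.

```python
import json


def conversation_to_notebook(turns):
    """Build an nbformat-4 notebook dict from (prompt, code) pairs.

    Hypothetical helper: each user prompt becomes a markdown cell and
    each generated snippet becomes an unexecuted code cell.
    """
    cells = []
    for prompt, code in turns:
        cells.append({"cell_type": "markdown", "metadata": {}, "source": prompt})
        cells.append({
            "cell_type": "code",
            "execution_count": None,  # not run yet
            "metadata": {},
            "outputs": [],
            "source": code,
        })
    # nbformat_minor 4 keeps the sketch simple: per-cell ids are only required from 4.5.
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


def save_notebook(notebook, path):
    """Write the notebook dict to disk as JSON so Jupyter can open it."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1)
```

Calling `save_notebook(conversation_to_notebook([("Show the first rows", "df.head()")]), "analysis.ipynb")` would produce a file that opens in Jupyter, where each cell can be inspected, edited, and rerun.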
What MLJAR Studio does:
- Sets up a local Python environment automatically and runs on macOS, Windows, and Linux
- Installs missing packages during the conversation
- Built-in AutoML for tabular data (binary classification, multiclass classification, regression)
- Works with standard Python libraries (pandas, matplotlib, etc.)
- Works with any data file: CSV, Excel, Stata, Parquet ...
- Connects to PostgreSQL, MySQL, SQL Server, Snowflake, Databricks, and Supabase
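As a rough illustration of the "installs missing packages" step in the list above, the sketch below checks which modules are importable and installs the rest with pip. It is an assumption about how such a step could work, not a description of what MLJAR Studio does internally; `missing_packages` and `install` are hypothetical helper names.

```python
import importlib.util
import subprocess
import sys


def missing_packages(module_names):
    """Return the top-level module names that cannot be imported here."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


def install(package):
    """Install a package into the interpreter that is running this code."""
    # Using sys.executable targets the active environment rather than
    # whichever pip happens to be first on PATH.
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])
```

One caveat: the import name and the package name can differ (for example `sklearn` is installed as `scikit-learn`), so a real tool would need a mapping between the two.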
For AI: use Ollama locally (zero data egress), bring your own OpenAI key, or use the MLJAR AI add-on.
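For readers unfamiliar with the local option, this is roughly what asking an Ollama server for code looks like. It is a sketch under stated assumptions: Ollama is running on its default port 11434, and the model name `llama3` is a placeholder for whatever model you have pulled. How MLJAR Studio itself talks to Ollama is not shown here.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(prompt, model="llama3"):
    """Prepare a non-streaming generate request; the model name is a placeholder."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3"):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

Because the request only goes to localhost, neither the prompt nor the data it describes leaves the machine.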
I built this because I wanted something between Jupyter Notebook (flexible but manual) and AI tools that generate code but don’t preserve the workflow. Most tools I tried either hide too much or don’t give reproducible results, and they are cloud-based.
Demos:
- 60-second demo: https://youtu.be/BjxpZYRiY4c
- Full 3-minute analysis: https://youtu.be/1DHMMxaNJxI
Pricing is $199 one-time, with a 7-day trial.
Curious if this is useful for others doing real data work, or if I’m solving my own problem here.
Happy to answer questions.
Reactive notebook cells that fire on widget interaction beat static Streamlit callbacks.
Polished UI but Jupyter, Observable, and Tableau already do this interactively.
Reactive cell execution is neat, but Streamlit and Gradio already own this space.
Prompt-hashed folders make model comparison easy, but Ollama testing tools already exist.
Filesystem-backed notebooks beat jupytext by making .py files the source of truth.