AI Chat Exporter
v3.2.0 — Convert AI chat HTML exports into clean, tagged Markdown notes.
A modern Python CLI tool that converts saved HTML chat logs from ChatGPT, Gemini, Claude, Copilot, and DeepSeek into structured Markdown files — ready for Obsidian, Notion, or any knowledge base.
Features
| Feature | Description |
|---|---|
| Live Watch Mode | Auto-detects new HTML files in your Downloads folder via watchdog |
| Full-Page Export | Convert an entire HTML chat page to Markdown — no search needed |
| Platform Cleanup | Auto-strips sidebars, branding, overlays, and input areas |
| Content-Safe Cleanup | Text-length guard prevents accidental removal of large content containers |
| Gemini DOM Support | Handles Gemini’s custom Angular web components |
| User-Code Dedup | Removes code blocks from user messages so only the AI’s code appears |
| Batch Processing | Process every HTML file in a directory at once |
| CLI + Interactive | Full argparse CLI flags or guided interactive menu |
| Smart Extraction | Finds specific AI responses by search phrase |
| AI Smart Titles | Generates clean headings from verbose questions (AI or heuristic) |
| Session Merging | Append multiple extractions into a single “Master Note” |
| 15+ Language Detection | Python, C++, JS, TS, Rust, Go, Java, SQL, Bash, Ruby, C#, Kotlin, Swift, and more |
| 3-Tier Detection | HTML class → proximity search → syntax analysis |
| YAML Frontmatter | Auto-generated tags, date, source for Obsidian compatibility |
| Zero Config Start | Works out of the box — config.json is optional |
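As an illustration of the YAML Frontmatter row, here is a minimal sketch of how a frontmatter block with tags, date, and source could be assembled. The function name `build_frontmatter`, the `title` field, and the exact key layout are assumptions made for this example; the exporter's real output may differ.

```python
from __future__ import annotations

from datetime import date


def build_frontmatter(title: str, source: str, tags: list[str], day: date | None = None) -> str:
    """Return a YAML frontmatter block (hypothetical layout, not the tool's exact output)."""
    day = day or date.today()
    # Escape backslashes and double quotes so the quoted title stays valid YAML.
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        "---",
        f'title: "{safe_title}"',
        f"date: {day.isoformat()}",
        f"source: {source}",
        "tags:",
    ]
    # One list item per tag, de-duplicated while keeping first-seen order.
    lines += [f"  - {tag}" for tag in dict.fromkeys(tags)]
    lines.append("---")
    return "\n".join(lines) + "\n"


note = build_frontmatter(
    "Binary search in Rust",
    "ChatGPT",
    ["rust", "algorithms", "rust"],
    date(2024, 1, 15),
)
print(note)
```

Obsidian reads a block like this as note properties as long as it is the first thing in the file, so the Markdown body would be appended after the closing `---`.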
Quick Start
1. Clone & Install
git clone https://github.com/calculusphile/AI-Chat-Exporter.git
cd AI-Chat-Exporter
pip install -r requirements.txt
2. Run (Interactive)
python watcher.py
3. Run (CLI)
# Live watch mode
python watcher.py --watch
# Process a single file
python watcher.py --file "path/to/chat.html"
# Full-page export (entire HTML → Markdown)
python watcher.py --file "chat.html" --full-page
# Batch process a folder
python watcher.py --batch "path/to/folder"
# Merge all extractions into one file
python watcher.py --file "chat.html" --merge "StudyNotes.md"
CLI Reference
| Flag | Short | Description |
|---|---|---|
| `--version` | `-v` | Print version and exit |
| `--watch` | `-w` | Start live-watch mode on Downloads folder |
| `--file PATH` | `-f` | Process a single HTML file |
| `--batch PATH` | `-b` | Process all HTML files in a directory |
| `--full-page` | `-p` | Export entire page instead of searching for phrases |
| `--merge NAME` | `-m` | Merge all extractions into one .md file |
| `--downloads PATH` | | Override the watched Downloads directory |
| `--debug` | | Enable verbose debug logging |
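For readers who want to see how these flags map onto argparse, the sketch below declares a parser with the same long and short options. It is a hypothetical reconstruction from the flag table, not the actual code in watcher.py; the `build_parser` name and the help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the real watcher.py)."""
    parser = argparse.ArgumentParser(
        prog="watcher.py",
        description="Convert AI chat HTML exports into Markdown notes.",
    )
    # action="version" prints the version string and exits.
    parser.add_argument("-v", "--version", action="version", version="%(prog)s 3.2.0")
    parser.add_argument("-w", "--watch", action="store_true",
                        help="start live-watch mode on the Downloads folder")
    parser.add_argument("-f", "--file", metavar="PATH",
                        help="process a single HTML file")
    parser.add_argument("-b", "--batch", metavar="PATH",
                        help="process all HTML files in a directory")
    parser.add_argument("-p", "--full-page", action="store_true",
                        help="export the entire page instead of searching for phrases")
    parser.add_argument("-m", "--merge", metavar="NAME",
                        help="merge all extractions into one .md file")
    parser.add_argument("--downloads", metavar="PATH",
                        help="override the watched Downloads directory")
    parser.add_argument("--debug", action="store_true",
                        help="enable verbose debug logging")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(
    ["--file", "chat.html", "--full-page", "--merge", "StudyNotes.md"]
)
print(args.file, args.full_page, args.merge)
```

argparse stores `--full-page` as the attribute `full_page`, and options that were not passed default to `False` (for `store_true` flags) or `None`.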
How It Works
- Input: Save any AI chat page as `.html` (Ctrl+S in browser)
- Detection: The watcher picks it up, or you pass it via `--file`
- Mode: Choose between search-based extraction or full-page export
- Smart Titles: Verbose questions are auto-cleaned into concise headings (AI or heuristic)
- Language Detection: Code blocks are analyzed with a 3-tier strategy:
  - HTML class attributes (`language-python`)
  - Proximity search (nearest text label above the block)
  - Syntax pattern matching (regex on code content)
- Output: Clean Markdown with YAML frontmatter, auto-tags, and proper code fences
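The 3-tier language detection described above can be sketched with the standard library alone. Everything in this example is an assumption made for illustration: the function name `detect_language`, the `KNOWN_LANGUAGES` set, the regex patterns, and the `"text"` fallback are not taken from converter.py, and the real tool works on parsed HTML rather than plain strings.

```python
import re

# Labels accepted by the tier 2 proximity check (illustrative subset).
KNOWN_LANGUAGES = {"python", "javascript", "typescript", "rust", "go", "java", "sql", "bash"}

# Tier 3 patterns: deliberately small and illustrative, not the tool's real rules.
SYNTAX_PATTERNS = [
    ("python", re.compile(r"^\s*(def \w+\(|import \w+|from \w+ import )", re.M)),
    ("rust", re.compile(r"\bfn \w+\(|\blet mut\b")),
    ("sql", re.compile(r"\bSELECT\b.+\bFROM\b", re.I | re.S)),
    ("bash", re.compile(r"^#!/bin/(ba)?sh|^\s*(echo|cd|pip install) ", re.M)),
]


def detect_language(css_classes: list, label_above: str, code: str) -> str:
    """Return a language name using class -> label -> syntax, in that order."""
    # Tier 1: an explicit class such as "language-python" wins outright.
    for cls in css_classes:
        match = re.fullmatch(r"(?:language|lang)-([\w+#-]+)", cls)
        if match:
            return match.group(1).lower()
    # Tier 2: the nearest text label above the block, if it names a known language.
    label = label_above.strip().lower()
    if label in KNOWN_LANGUAGES:
        return label
    # Tier 3: fall back to regex heuristics on the code itself.
    for name, pattern in SYNTAX_PATTERNS:
        if pattern.search(code):
            return name
    return "text"  # nothing matched; emit a neutral fence label


print(detect_language(["hljs", "language-python"], "", "x = 1"))
print(detect_language([], "Rust", "println!(\"hi\");"))
print(detect_language([], "", "SELECT id FROM users;"))
```

Ordering the tiers from most to least reliable means the regex heuristics, which are the most error-prone, only run when the page gives no explicit hint.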
Project Structure
AI_Chat_Exporter/
├── watcher.py # CLI entry point + file watcher
├── converter.py # HTML → Markdown conversion engine
├── config_loader.py # Typed config management
├── title_generator.py # AI + heuristic smart title generation
├── logger.py # Centralised logging
├── config.json # User settings
├── requirements.txt # Dependencies
├── ARCHITECTURE.md # Developer guide & change-impact map
├── README.md
├── LICENSE
└── Exported_Notes/ # Output (git-ignored)
Tech Stack
- Python 3 — Core language
- BeautifulSoup — HTML parsing & DOM manipulation
- Watchdog — File system event monitoring
- OpenAI API (optional) — AI-powered title generation