mirror of
https://github.com/tnypxl/rollup.git
synced 2025-12-12 22:23:16 +00:00
CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Build and Run Commands
# Build the binary
go build -o rollup .
# Run directly
go run main.go [command]
# Run tests
go test ./...
# Run a single test
go test -run TestName ./path/to/package
Project Overview
Rollup is a Go CLI tool that aggregates text-based files and webpages into markdown files. It has three main commands:
- files - Rolls up local files into a single markdown file
- web - Scrapes webpages and converts them to markdown using Playwright
- generate - Creates a default rollup.yml config file
Architecture
Entry Point: main.go initializes Playwright browser and loads config before executing commands via Cobra.
Command Layer (cmd/):
- root.go - Cobra root command with global flags (--config, --verbose)
- files.go - File aggregation with glob pattern matching for ignore/codegen detection
- web.go - Web scraping orchestration; converts config site definitions to scraper configs
- generate.go - Scans a directory for text file types and generates rollup.yml
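The glob-based ignore check mentioned for files.go can be sketched with the standard library alone. This is a minimal illustration, not the repository's code: `matchesAny` is a hypothetical helper, and testing each pattern against both the full path and the base name is an assumption about the desired behavior.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matchesAny reports whether path, or its base name, matches any of the
// glob patterns. Hypothetical helper; the logic in cmd/files.go may differ.
func matchesAny(path string, patterns []string) bool {
	for _, p := range patterns {
		// Malformed patterns return an error and are skipped here.
		if ok, err := filepath.Match(p, path); err == nil && ok {
			return true
		}
		if ok, err := filepath.Match(p, filepath.Base(path)); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	ignore := []string{"*.log", "vendor/*"}
	for _, f := range []string{"main.go", "debug.log", "vendor/lib.go"} {
		fmt.Printf("%s ignored=%v\n", f, matchesAny(f, ignore))
	}
}
```

Note that `filepath.Match` does not let `*` cross a path separator, so a pattern such as `*.log` only reaches files in subdirectories through the base-name check.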
Internal Packages:
- internal/config - YAML config loading and validation. Defines Config, SiteConfig, PathOverride structs
- internal/scraper - Playwright-based web scraping with rate limiting, HTML-to-markdown conversion via goquery and the html-to-markdown library
Key Dependencies:
- spf13/cobra - CLI framework
- playwright-go - Browser automation for web scraping
- PuerkitoBio/goquery - HTML parsing and CSS selector extraction
- JohannesKaufmann/html-to-markdown - HTML to markdown conversion
Configuration
The tool reads from rollup.yml by default. Key config fields:
- file_extensions - File types to include in the rollup
- ignore_paths / code_generated_paths - Glob patterns for exclusion
- sites - Web scraping targets with CSS selectors, path filtering, rate limiting
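The config fields above could map onto Go structs roughly as follows. This is a sketch under stated assumptions: the YAML keys come from the list above, but the Go field names, the `SiteConfig` fields, and the validation rules are hypothetical and may not match internal/config.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
)

// SiteConfig is a hypothetical, reduced shape for one scraping target.
type SiteConfig struct {
	BaseURL     string `yaml:"base_url"`     // assumed key name
	CSSLocator  string `yaml:"css_locator"`  // assumed key name
	RequestsSec int    `yaml:"requests_sec"` // assumed key name
}

// Config mirrors the top-level keys listed above.
type Config struct {
	FileExtensions     []string     `yaml:"file_extensions"`
	IgnorePaths        []string     `yaml:"ignore_paths"`
	CodeGeneratedPaths []string     `yaml:"code_generated_paths"`
	Sites              []SiteConfig `yaml:"sites"`
}

// Validate applies illustrative rules: the config must name something to
// roll up, and every glob pattern must be well formed.
func (c Config) Validate() error {
	if len(c.FileExtensions) == 0 && len(c.Sites) == 0 {
		return errors.New("config must set file_extensions or sites")
	}
	globs := append(append([]string{}, c.IgnorePaths...), c.CodeGeneratedPaths...)
	for _, p := range globs {
		if _, err := filepath.Match(p, ""); err != nil {
			return fmt.Errorf("invalid glob %q: %w", p, err)
		}
	}
	return nil
}

func main() {
	cfg := Config{
		FileExtensions: []string{"go", "md"},
		IgnorePaths:    []string{"vendor/*"},
	}
	fmt.Println("valid config error:", cfg.Validate())
	fmt.Println("empty config error:", Config{}.Validate())
}
```

The struct tags only take effect once a YAML decoder is wired in; that step is left out here to keep the sketch free of third-party imports.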