CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Build and Run Commands

# Build the binary
go build -o rollup .

# Run directly
go run main.go [command]

# Run tests
go test ./...

# Run a single test
go test -run TestName ./path/to/package

Project Overview

Rollup is a Go CLI tool that aggregates text-based files and webpages into markdown files. It has three main commands:

  • files - Rolls up local files into a single markdown file
  • web - Scrapes webpages with Playwright and converts them to markdown
  • generate - Creates a default rollup.yml config file

Architecture

Entry Point: main.go initializes the Playwright browser and loads the config before executing commands via Cobra.

Command Layer (cmd/):

  • root.go - Cobra root command with global flags (--config, --verbose)
  • files.go - File aggregation with glob pattern matching for ignore/codegen detection
  • web.go - Web scraping orchestration, converts config site definitions to scraper configs
  • generate.go - Scans directory for text file types and generates rollup.yml
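
The ignore/codegen detection in files.go can be pictured with a minimal sketch built on the standard library's filepath.Match. The helper names (matchesAny, classify), the labels, and the example patterns are hypothetical and are not taken from the repository; the real matching logic may differ, for example by supporting `**`, which filepath.Match does not.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matchesAny reports whether path, or its base name, matches any glob pattern.
// Hypothetical helper; rollup's actual matching code may differ.
func matchesAny(path string, patterns []string) bool {
	for _, p := range patterns {
		// filepath.Match returns an error only for malformed patterns;
		// this sketch treats a malformed pattern as a non-match.
		if ok, err := filepath.Match(p, path); err == nil && ok {
			return true
		}
		if ok, err := filepath.Match(p, filepath.Base(path)); err == nil && ok {
			return true
		}
	}
	return false
}

// classify labels a path as "ignored", "generated", or "included".
// Ignore patterns take precedence over code-generated patterns (an assumption).
func classify(path string, ignore, generated []string) string {
	switch {
	case matchesAny(path, ignore):
		return "ignored"
	case matchesAny(path, generated):
		return "generated"
	default:
		return "included"
	}
}

func main() {
	ignore := []string{"vendor/*", "*.log"}
	generated := []string{"*.pb.go"}
	for _, p := range []string{"vendor/lib.go", "api/types.pb.go", "cmd/files.go"} {
		fmt.Printf("%s -> %s\n", p, classify(p, ignore, generated))
	}
}
```

Matching against both the full path and the base name lets a pattern such as `*.pb.go` apply at any directory depth, which is one plausible design choice rather than a confirmed behavior of the tool.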

Internal Packages:

  • internal/config - YAML config loading and validation. Defines Config, SiteConfig, PathOverride structs
  • internal/scraper - Playwright-based web scraping with rate limiting. Parses HTML with goquery and converts it to markdown with the html-to-markdown library
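
The rate limiting mentioned for internal/scraper could look roughly like the following sketch, which enforces a minimum gap between the starts of consecutive requests. It uses only the standard library; fetchAll, minGap, and the fetch callback are hypothetical names, and the actual package may rely on a different mechanism (a token bucket, for instance).

```go
package main

import (
	"fmt"
	"time"
)

// fetchAll calls fetch once per URL, in order, and waits so that at least
// minGap elapses between the starts of consecutive calls.
// Hypothetical helper; not the scraper package's real API.
func fetchAll(urls []string, minGap time.Duration, fetch func(string) error) []error {
	errs := make([]error, 0, len(urls))
	var last time.Time
	for i, u := range urls {
		if i > 0 {
			// Sleep only for the part of the gap not already spent in the previous fetch.
			if wait := minGap - time.Since(last); wait > 0 {
				time.Sleep(wait)
			}
		}
		last = time.Now()
		errs = append(errs, fetch(u))
	}
	return errs
}

func main() {
	urls := []string{"https://example.com/a", "https://example.com/b"}
	start := time.Now()
	errs := fetchAll(urls, 50*time.Millisecond, func(u string) error {
		fmt.Println("fetching", u) // a real fetch would drive the browser here
		return nil
	})
	fmt.Println(len(errs), "fetches, elapsed >= 50ms:", time.Since(start) >= 50*time.Millisecond)
}
```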

Key Dependencies:

  • spf13/cobra - CLI framework
  • playwright-go - Browser automation for web scraping
  • PuerkitoBio/goquery - HTML parsing and CSS selector extraction
  • JohannesKaufmann/html-to-markdown - HTML to markdown conversion

Configuration

The tool reads from rollup.yml by default. Key config fields:

  • file_extensions - File types to include in rollup
  • ignore_paths / code_generated_paths - Glob patterns for exclusion
  • sites - Web scraping targets with CSS selectors, path filtering, rate limiting
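
As a rough illustration of how these fields might map onto the Config and SiteConfig structs, here is a sketch with YAML struct tags and a small validation step. The top-level tag names come from the field list above; every SiteConfig field (base_url, css_selector, requests_per_second) and the validation rules are assumptions made for the example, not the repository's actual definitions. The YAML decoding itself is omitted.

```go
package main

import (
	"errors"
	"fmt"
)

// SiteConfig describes one scraping target.
// All field names and tags here are assumed for illustration.
type SiteConfig struct {
	BaseURL           string  `yaml:"base_url"`
	CSSSelector       string  `yaml:"css_selector"`
	RequestsPerSecond float64 `yaml:"requests_per_second"`
}

// Config mirrors the top-level keys listed for rollup.yml.
type Config struct {
	FileExtensions     []string     `yaml:"file_extensions"`
	IgnorePaths        []string     `yaml:"ignore_paths"`
	CodeGeneratedPaths []string     `yaml:"code_generated_paths"`
	Sites              []SiteConfig `yaml:"sites"`
}

// Validate applies a few plausible checks; the real rules may differ.
func (c Config) Validate() error {
	if len(c.FileExtensions) == 0 && len(c.Sites) == 0 {
		return errors.New("config needs at least one file extension or site")
	}
	for i, s := range c.Sites {
		if s.BaseURL == "" {
			return fmt.Errorf("sites[%d]: base_url is required", i)
		}
		if s.RequestsPerSecond < 0 {
			return fmt.Errorf("sites[%d]: requests_per_second must not be negative", i)
		}
	}
	return nil
}

func main() {
	cfg := Config{
		FileExtensions: []string{"go", "md"},
		IgnorePaths:    []string{"vendor/*"},
	}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("config ok")
}
```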