Claude
ff13012408
fix: address functionality gaps identified in code review
- Wire up --config/-f flag to actually load custom config files
- Move config loading to PersistentPreRunE in root.go
- Simplify main.go to just call cmd.Execute()
- Move Playwright init to web command's PreRunE/PostRunE
- Remove unused functions from cmd/web.go (~90 lines of dead code)
- Remove writeSingleFile, writeMultipleFiles, generateDefaultFilename
- Remove scrapeURL, extractAndConvertContent, testExtractAndConvertContent
- Remove unused mock function from web_test.go
- Add OutputType validation to Config.Validate()
- Only allow "single", "separate", or empty string
- Add test cases for valid and invalid output types
2025-11-27 16:05:42 +00:00
Arik Jones
9341a51d09
fix multi-file output
2024-12-06 17:02:31 -06:00
tnypxl
02e39baf38
flatten scrape config to 'sites:'
* flatten scrape config to 'sites:'. Update unit tests and readme.
* remove check for file_extensions configuration.
* show progress indication after 5 seconds.
* add documentation to functions
* fix: remove MaxDepth and link extraction functionality
* fix: Remove MaxDepth references from cmd/web.go
2024-10-14 16:09:58 -05:00
333b9a366c
fix: Resolve playwright function deprecations and io/ioutil function deprecations.
2024-09-24 15:13:36 -05:00
Arik Jones
73116e8d82
Fix logging and other issues that were preventing scraping
2024-09-21 15:54:33 -05:00
5482621d99
fix: Use preferred fmt.Fprintf function
2024-09-20 13:48:28 -05:00
3788a08b00
fix: Remove unused args in getDefaultFilename(), use preferred fmt.Fprintf function
2024-09-20 13:47:52 -05:00
Arik Jones
160a15dbb1
fix: Use logger instead of log. Move web subcommand initialization to root.go
2024-09-19 11:44:27 -05:00
Arik Jones
d3ba28d03b
fix: Output markdown files should end in *.rollup.md
2024-09-19 10:56:00 -05:00
Arik Jones (aider)
53dcd6eb71
feat: Add support for exclusionary CSS paths in config.go
2024-09-14 20:59:08 -05:00
Arik Jones (aider)
ece9492b30
fix: Remove unused import in cmd/web.go
2024-09-14 20:56:51 -05:00
Arik Jones (aider)
c1755836b5
fix: Move HTML to Markdown conversion to scraper.go
2024-09-14 20:55:35 -05:00
Arik Jones
939cffb55e
fix: Simplify sanitizeFilename function
2024-09-14 20:55:34 -05:00
Arik Jones (aider)
b6de9d211b
fix: Merge duplicate runWeb function and add missing function definitions
2024-09-14 20:42:10 -05:00
Arik Jones (aider)
a6ebf0062a
fix: Add --verbose flag to web subcommand
2024-09-14 20:41:23 -05:00
Arik Jones (aider)
aaff602b3e
fix: Use local getFilenameFromContent function instead of undefined scraper.GetFilenameFromContent
2024-09-14 20:38:06 -05:00
Arik Jones (aider)
52c7de255d
feat: Implement scraping of multiple URLs with optional CSS locators and separate output files
2024-09-14 20:35:35 -05:00
Arik Jones (aider)
f4c368e112
fix: Update web command to properly handle --exclude flag
2024-09-14 17:02:44 -05:00
Arik Jones (aider)
d80151b9eb
fix: reorder flag definitions in cmd/web.go
2024-09-14 17:01:49 -05:00
Arik Jones (aider)
9196708426
fix: Update web command flags
2024-09-14 17:01:17 -05:00
Arik Jones (aider)
7cdd68d020
feat: Separate include and exclude selectors in web scraper
2024-09-14 16:59:59 -05:00
Arik Jones (aider)
d66fd04016
fix: Use - instead of ! to filter unwanted elements
2024-09-14 16:53:42 -05:00
Arik Jones (aider)
e50484a6fa
fix: Remove XPath-related code from cmd/web.go
2024-09-14 16:51:54 -05:00
Arik Jones (aider)
d0ee666b07
refactor: Modify scraper to capture only the main content
2024-09-14 15:20:15 -05:00
Arik Jones (aider)
9660a12549
fix: remove unused import of "github.com/tnypxl/rollup/internal/config"
2024-09-14 15:16:36 -05:00
Arik Jones (aider)
8e89621ef0
fix: Remove redeclaration of cfg in cmd/web.go
2024-09-14 15:16:11 -05:00
Arik Jones (aider)
595c451ad9
feat: Pass scraper configuration to command execution
2024-09-14 15:15:39 -05:00
Arik Jones
3390606916
feat: Add support for time package in web.go
2024-09-14 14:52:44 -05:00
Arik Jones (aider)
50c9e7898d
feat: Implement recursive web scraping and content extraction
2024-09-14 14:46:34 -05:00
Arik Jones
cf99bd8bf1
feat: Implement web command functionality
2024-09-14 14:46:31 -05:00
Arik Jones (aider)
d74213e4ff
fix: resolve build errors in cmd/web.go
2024-09-14 14:43:18 -05:00
Arik Jones (aider)
0494d9433f
feat: Add depth, CSS, and XPath options to web command
2024-09-14 14:42:21 -05:00
Arik Jones (aider)
514bcacd8a
feat: Implement recursive web scraping with configurable depth and content extraction
2024-09-14 14:41:54 -05:00
Arik Jones
0163c4e504
Add a configuration layer that reads rollup.yml, which may be preferred over CLI flags.
2024-09-05 23:41:39 -05:00
Arik Jones (aider)
876b2d8917
refactor: Update field names and comment out undefined function call
2024-09-05 23:07:51 -05:00
Arik Jones (aider)
431d084d2c
fix: Resolve cfg redeclaration and update ignore patterns field
2024-09-05 23:06:44 -05:00
Arik Jones
5ab1a97f1c
feat: Implement web scraping and Markdown conversion
2024-09-05 23:06:43 -05:00
Arik Jones (aider)
587ab03f0c
fix: Update summarizeContent function
2024-09-03 11:35:11 -05:00
Arik Jones (aider)
5d9dcc6df4
feat: Update Anthropic SDK usage to latest version
2024-09-03 11:30:38 -05:00
Arik Jones (aider)
129d7f00e4
fix: Update Anthropic SDK usage in cmd/web.go
2024-09-03 11:26:34 -05:00
Arik Jones (aider)
0f5cb3e505
feat: Implement web subcommand to fetch, summarize, and save web content
2024-09-03 11:25:15 -05:00
Arik Jones (aider)
baf92acfd2
feat: add web subcommand
2024-09-03 10:55:15 -05:00