Commit Graph

47 Commits

Author | SHA1 | Message | Date
Arik Jones (aider) | 8f824d8990 | feat: enhance logging in runWeb function for better debugging | 2024-09-21 10:57:51 -05:00
Arik Jones (aider) | 7e4f4cdbb6 | fix: update writeMultipleFiles function to handle multiple files | 2024-09-19 16:37:07 -05:00
Arik Jones (aider) | f1f0bd3895 | feat: add MaxDepth to URL-based configuration and use outputType directly | 2024-09-19 16:36:24 -05:00
Arik Jones (aider) | 57bbc5a1ac | refactor: update writeMultipleFiles to create single output file | 2024-09-19 16:35:46 -05:00
Arik Jones (aider) | 32499abbc0 | fix: improve URL parsing and title extraction in getFilenameFromContent | 2024-09-19 16:33:55 -05:00
Arik Jones (aider) | 237ed512fc | fix: handle error from getFilenameFromContent in writeMultipleFiles | 2024-09-19 16:32:30 -05:00
Arik Jones (aider) | 7c8fcc3261 | fix: update getFilenameFromContent to handle invalid URLs and use .rollup.md suffix | 2024-09-19 16:31:43 -05:00
Arik Jones (aider) | 30e11153f9 | refactor: update getFilenameFromContent to remove http from filenames | 2024-09-19 16:29:55 -05:00
Arik Jones (aider) | 1b696ce9c6 | refactor: use wrapper functions for easier testing | 2024-09-19 16:25:02 -05:00
Arik Jones (aider) | 702665bb2e | fix: import config package to resolve undefined error | 2024-09-19 16:12:30 -05:00
Arik Jones (aider) | 1d02cab585 | fix: resolve type mismatch for PathOverrides in SiteConfig | 2024-09-19 16:11:14 -05:00
Arik Jones (aider) | 569ff9924d | feat: implement site-based scraping with path overrides | 2024-09-19 16:06:55 -05:00
Arik Jones | 160a15dbb1 | fix: Use logger instead of log. Move web subcommand initialization to root.go | 2024-09-19 11:44:27 -05:00
Arik Jones | d3ba28d03b | fix: Output markdown files should end in *.rollup.md | 2024-09-19 10:56:00 -05:00
Arik Jones (aider) | 53dcd6eb71 | feat: Add support for exclusionary CSS paths in config.go | 2024-09-14 20:59:08 -05:00
Arik Jones (aider) | ece9492b30 | fix: Remove unused import in cmd/web.go | 2024-09-14 20:56:51 -05:00
Arik Jones (aider) | c1755836b5 | fix: Move HTML to Markdown conversion to scraper.go | 2024-09-14 20:55:35 -05:00
Arik Jones | 939cffb55e | fix: Simplify sanitizeFilename function | 2024-09-14 20:55:34 -05:00
Arik Jones (aider) | b6de9d211b | fix: Merge duplicate runWeb function and add missing function definitions | 2024-09-14 20:42:10 -05:00
Arik Jones (aider) | a6ebf0062a | fix: Add --verbose flag to web subcommand | 2024-09-14 20:41:23 -05:00
Arik Jones (aider) | aaff602b3e | fix: Use local getFilenameFromContent function instead of undefined scraper.GetFilenameFromContent | 2024-09-14 20:38:06 -05:00
Arik Jones (aider) | 52c7de255d | feat: Implement scraping of multiple URLs with optional CSS locators and separate output files | 2024-09-14 20:35:35 -05:00
Arik Jones (aider) | f4c368e112 | fix: Update web command to properly handle --exclude flag | 2024-09-14 17:02:44 -05:00
Arik Jones (aider) | d80151b9eb | fix: reorder flag definitions in cmd/web.go | 2024-09-14 17:01:49 -05:00
Arik Jones (aider) | 9196708426 | fix: Update web command flags | 2024-09-14 17:01:17 -05:00
Arik Jones (aider) | 7cdd68d020 | feat: Separate include and exclude selectors in web scraper | 2024-09-14 16:59:59 -05:00
Arik Jones (aider) | d66fd04016 | fix: Use - instead of ! to filter unwanted elements | 2024-09-14 16:53:42 -05:00
Arik Jones (aider) | e50484a6fa | fix: Remove XPath-related code from cmd/web.go | 2024-09-14 16:51:54 -05:00
Arik Jones (aider) | d0ee666b07 | refactor: Modify scraper to capture only the main content | 2024-09-14 15:20:15 -05:00
Arik Jones (aider) | 9660a12549 | fix: remove unused import of "github.com/tnypxl/rollup/internal/config" | 2024-09-14 15:16:36 -05:00
Arik Jones (aider) | 8e89621ef0 | fix: Remove redeclaration of cfg in cmd/web.go | 2024-09-14 15:16:11 -05:00
Arik Jones (aider) | 595c451ad9 | feat: Pass scraper configuration to command execution | 2024-09-14 15:15:39 -05:00
Arik Jones | 3390606916 | feat: Add support for time package in web.go | 2024-09-14 14:52:44 -05:00
Arik Jones (aider) | 50c9e7898d | feat: Implement recursive web scraping and content extraction | 2024-09-14 14:46:34 -05:00
Arik Jones | cf99bd8bf1 | feat: Implement web command functionality | 2024-09-14 14:46:31 -05:00
Arik Jones (aider) | d74213e4ff | fix: resolve build errors in cmd/web.go | 2024-09-14 14:43:18 -05:00
Arik Jones (aider) | 0494d9433f | feat: Add depth, CSS, and XPath options to web command | 2024-09-14 14:42:21 -05:00
Arik Jones (aider) | 514bcacd8a | feat: Implement recursive web scraping with configurable depth and content extraction | 2024-09-14 14:41:54 -05:00
Arik Jones | 0163c4e504 | Adds a configuration layer for use rollup.yml which may be preferred over CLI flags. | 2024-09-05 23:41:39 -05:00
Arik Jones (aider) | 876b2d8917 | refactor: Update field names and comment out undefined function call | 2024-09-05 23:07:51 -05:00
Arik Jones (aider) | 431d084d2c | fix: Resolve cfg redeclaration and update ignore patterns field | 2024-09-05 23:06:44 -05:00
Arik Jones | 5ab1a97f1c | feat: Implement web scraping and Markdown conversion | 2024-09-05 23:06:43 -05:00
Arik Jones (aider) | 587ab03f0c | fix: Update summarizeContent function | 2024-09-03 11:35:11 -05:00
Arik Jones (aider) | 5d9dcc6df4 | feat: Update Anthropic SDK usage to latest version | 2024-09-03 11:30:38 -05:00
Arik Jones (aider) | 129d7f00e4 | fix: Update Anthropic SDK usage in cmd/web.go | 2024-09-03 11:26:34 -05:00
Arik Jones (aider) | 0f5cb3e505 | feat: Implement web subcommand to fetch, summarize, and save web content | 2024-09-03 11:25:15 -05:00
Arik Jones (aider) | baf92acfd2 | feat: add web subcommand | 2024-09-03 10:55:15 -05:00