Commit Graph

  • ece9492b30 fix: Remove unused import in cmd/web.go Arik Jones (aider) 2024-09-14 20:56:51 -05:00
  • c1755836b5 fix: Move HTML to Markdown conversion to scraper.go Arik Jones (aider) 2024-09-14 20:55:35 -05:00
  • 939cffb55e fix: Simplify sanitizeFilename function Arik Jones 2024-09-14 20:55:34 -05:00
  • b6de9d211b fix: Merge duplicate runWeb function and add missing function definitions Arik Jones (aider) 2024-09-14 20:42:10 -05:00
  • a6ebf0062a fix: Add --verbose flag to web subcommand Arik Jones (aider) 2024-09-14 20:41:23 -05:00
  • aaff602b3e fix: Use local getFilenameFromContent function instead of undefined scraper.GetFilenameFromContent Arik Jones (aider) 2024-09-14 20:38:06 -05:00
  • 6f4750c900 fix: Remove references to non-existent CSSLocator field in Config struct Arik Jones (aider) 2024-09-14 20:36:31 -05:00
  • 52c7de255d feat: Implement scraping of multiple URLs with optional CSS locators and separate output files Arik Jones (aider) 2024-09-14 20:35:35 -05:00
  • 5264023cba feat: add MIT license Arik Jones (aider) 2024-09-14 20:15:05 -05:00
  • 87c2a81375 feat: Add README.md Arik Jones (aider) 2024-09-14 20:13:00 -05:00
  • b1db362a94 fix: Initialize logger before calling InitPlaywright Arik Jones (aider) 2024-09-14 19:59:39 -05:00
  • 23508df6f4 feat: Add optional logging to the scraper Arik Jones (aider) 2024-09-14 19:59:02 -05:00
  • 01d6b2f54f fix: Improve page content extraction in scraper Arik Jones 2024-09-14 19:59:01 -05:00
  • 3378402fb9 fix: Handle missing content in ProcessHTMLContent Arik Jones (aider) 2024-09-14 19:43:58 -05:00
  • 2ab0d74279 fix: Update scraper to handle empty URLs Arik Jones 2024-09-14 19:42:38 -05:00
  • eaa7135eab feat: Improve content extraction with fallback to body Arik Jones (aider) 2024-09-14 17:05:05 -05:00
  • f4c368e112 fix: Update web command to properly handle --exclude flag Arik Jones (aider) 2024-09-14 17:02:44 -05:00
  • d80151b9eb fix: reorder flag definitions in cmd/web.go Arik Jones (aider) 2024-09-14 17:01:49 -05:00
  • 9196708426 fix: Update web command flags Arik Jones (aider) 2024-09-14 17:01:17 -05:00
  • 7cdd68d020 feat: Separate include and exclude selectors in web scraper Arik Jones (aider) 2024-09-14 16:59:59 -05:00
  • 39e06ee9d5 fix: remove space between minus and CSS path in parseSelectors Arik Jones (aider) 2024-09-14 16:54:34 -05:00
  • d66fd04016 fix: Use - instead of ! to filter unwanted elements Arik Jones (aider) 2024-09-14 16:53:42 -05:00
  • e50484a6fa fix: Remove XPath-related code from cmd/web.go Arik Jones (aider) 2024-09-14 16:51:54 -05:00
  • 56d5a8a194 refactor: Remove XPath support Arik Jones (aider) 2024-09-14 16:51:18 -05:00
  • 09f8ed07c2 fix: Remove unused variable excludeXPaths in ExtractContentWithXPath function Arik Jones (aider) 2024-09-14 16:50:34 -05:00
  • f1af20e95e feat: Add support for excluding child elements in content extraction Arik Jones (aider) 2024-09-14 16:49:32 -05:00
  • d0ee666b07 refactor: Modify scraper to capture only the main content Arik Jones (aider) 2024-09-14 15:20:15 -05:00
  • bfd70fd786 fix: Add import for scraper package in cmd/root.go Arik Jones (aider) 2024-09-14 15:17:18 -05:00
  • 8b85d755af fix: Update Execute function to accept configuration and scraper config Arik Jones (aider) 2024-09-14 15:17:00 -05:00
  • 9660a12549 fix: remove unused import of "github.com/tnypxl/rollup/internal/config" Arik Jones (aider) 2024-09-14 15:16:36 -05:00
  • 8e89621ef0 fix: Remove redeclaration of cfg in cmd/web.go Arik Jones (aider) 2024-09-14 15:16:11 -05:00
  • 595c451ad9 feat: Pass scraper configuration to command execution Arik Jones (aider) 2024-09-14 15:15:39 -05:00
  • 1a57be80fa fix: Remove print media emulation and improve CSS selector extraction Arik Jones (aider) 2024-09-14 15:14:53 -05:00
  • a3b23a6d34 ... Arik Jones 2024-09-14 15:11:24 -05:00
  • 8932f503c6 feat: Pass configuration to command execution Arik Jones (aider) 2024-09-14 15:09:57 -05:00
  • ea12ad631c fix: Fix assignment mismatch in ExtractContentWithCSS function Arik Jones (aider) 2024-09-14 14:54:04 -05:00
  • 885f3fc2b8 feat: Add missing scraper functions Arik Jones (aider) 2024-09-14 14:52:45 -05:00
  • 3390606916 feat: Add support for time package in web.go Arik Jones 2024-09-14 14:52:44 -05:00
  • 50c9e7898d feat: Implement recursive web scraping and content extraction Arik Jones (aider) 2024-09-14 14:46:34 -05:00
  • cf99bd8bf1 feat: Implement web command functionality Arik Jones 2024-09-14 14:46:31 -05:00
  • d74213e4ff fix: resolve build errors in cmd/web.go Arik Jones (aider) 2024-09-14 14:43:18 -05:00
  • 0494d9433f feat: Add depth, CSS, and XPath options to web command Arik Jones (aider) 2024-09-14 14:42:21 -05:00
  • 514bcacd8a feat: Implement recursive web scraping with configurable depth and content extraction Arik Jones (aider) 2024-09-14 14:41:54 -05:00
  • 0163c4e504 Adds a configuration layer for use rollup.yml which may be preferred over CLI flags. Arik Jones 2024-09-05 23:41:39 -05:00
  • f376f186c2 fix: Update cmd/root.go to use the correct field name for ignore patterns Arik Jones (aider) 2024-09-05 23:08:38 -05:00
  • 876b2d8917 refactor: Update field names and comment out undefined function call Arik Jones (aider) 2024-09-05 23:07:51 -05:00
  • 4caf6f5646 fix: Update cfg.IgnorePatterns to cfg.Ignore in cmd/root.go Arik Jones (aider) 2024-09-05 23:07:15 -05:00
  • 431d084d2c fix: Resolve cfg redeclaration and update ignore patterns field Arik Jones (aider) 2024-09-05 23:06:44 -05:00
  • 5ab1a97f1c feat: Implement web scraping and Markdown conversion Arik Jones 2024-09-05 23:06:43 -05:00
  • 5824f362b6 feat: Add ability to ignore specific files and/or file globs Arik Jones (aider) 2024-09-05 23:05:52 -05:00
  • 3d2a42ddc2 feat: Add support for configuration file Arik Jones 2024-09-05 23:05:51 -05:00
  • 587ab03f0c fix: Update summarizeContent function Arik Jones (aider) 2024-09-03 11:35:11 -05:00
  • 5d9dcc6df4 feat: Update Anthropic SDK usage to latest version Arik Jones (aider) 2024-09-03 11:30:38 -05:00
  • f0dce84dbd fix: update anthropic-sdk-go dependency version Arik Jones 2024-09-03 11:30:35 -05:00
  • 9ed53c286a fix: update go.mod file Arik Jones (aider) 2024-09-03 11:26:46 -05:00
  • 129d7f00e4 fix: Update Anthropic SDK usage in cmd/web.go Arik Jones (aider) 2024-09-03 11:26:34 -05:00
  • af3bb58d7e feat: add Anthropic SDK dependency Arik Jones 2024-09-03 11:26:33 -05:00
  • 0f5cb3e505 feat: Implement web subcommand to fetch, summarize, and save web content Arik Jones (aider) 2024-09-03 11:25:15 -05:00
  • ef89da6c39 Add main.go Arik Jones 2024-09-03 11:07:45 -05:00
  • 8d402affbd Update go mod name Arik Jones 2024-09-03 11:05:28 -05:00
  • baf92acfd2 feat: add web subcommand Arik Jones (aider) 2024-09-03 10:55:15 -05:00
  • c2be1b57c0 Ignore rollup output files Arik Jones 2024-09-03 10:53:23 -05:00
  • 8c64479213 Remove conventions document from trackings Arik Jones 2024-09-03 10:51:30 -05:00
  • 7dd149417a Initial commit Arik Jones 2024-09-03 10:50:10 -05:00
  • 8ff624350b gitignore Arik Jones 2024-09-02 10:08:01 -05:00