Commit Graph

  • 61dde38773 docs: update README to reflect current functionality (#6) main tnypxl 2025-11-27 11:42:19 -06:00
  • 9495ddd788 fix: resolve nil logger panic and CLI URL processing (#5) tnypxl 2025-11-27 11:04:08 -06:00
  • eb3b611864 Merge branch 'claude/fix-bugs-and-gaps-01DvJSzruQh49DU6XK5AykQU' (#4) tnypxl 2025-11-27 10:50:03 -06:00
  • 52c326eed1 Merge branch 'claude/fix-bugs-and-gaps-01DvJSzruQh49DU6XK5AykQU' claude/fix-bugs-and-gaps-01DvJSzruQh49DU6XK5AykQU claude/code-review-analysis-01DvJSzruQh49DU6XK5AykQU Claude 2025-11-27 16:41:10 +00:00
  • ff13012408 fix: address functionality gaps identified in code review Claude 2025-11-27 16:05:42 +00:00
  • 877a7876c0 fix: resolve 5 bugs identified in code review (#3) tnypxl 2025-11-27 09:58:09 -06:00
  • 09608cf073 fix: resolve 5 bugs identified in code review Claude 2025-11-27 15:56:37 +00:00
  • 7569aff6ec Add CLAUDE.md with project guidance for Claude Code (#2) tnypxl 2025-11-27 09:29:10 -06:00
  • 9341a51d09 fix multi-file output Arik Jones 2024-12-06 17:02:31 -06:00
  • 9e9ac903e4 remove maxdepth from tests Arik Jones 2024-12-06 15:19:12 -06:00
  • 645626f763 remove maxdepth from tests Arik Jones 2024-12-06 15:17:33 -06:00
  • eaaa6449b4 remove old commands refactor-20241114 Arik Jones 2024-11-23 13:49:01 -06:00
  • 318951063a remove old implementation of the web scraper Arik Jones 2024-11-23 12:36:46 -06:00
  • dcf32eaeeb fix: Load configuration before running rollup feat/click-and-scrape Arik Jones (aider) 2024-10-19 13:34:35 -05:00
  • 02e39baf38 flatten scrape config to 'sites:' v0.1.0 tnypxl 2024-10-14 16:09:58 -05:00
  • 20b6218c7f fix: Remove MaxDepth references from cmd/web.go improve-config-management Arik Jones (aider) 2024-10-12 20:45:02 -05:00
  • 8a16cec600 fix: Remove MaxDepth and link extraction functionality from scraper Arik Jones (aider) 2024-10-12 20:43:54 -05:00
  • 7676638fa4 fix: remove MaxDepth and link extraction functionality Arik Jones (aider) 2024-10-12 20:42:16 -05:00
  • ad5147551a Add function documentation Arik Jones 2024-10-12 20:33:11 -05:00
  • 6870258944 remove check for file_extensions configuration. show progress indication after 5 seconds. Arik Jones 2024-10-12 15:10:37 -05:00
  • da92da21dc flatten scrape config to 'sites:'. Update unit tests and readme. Arik Jones 2024-10-12 13:47:21 -05:00
  • e42ad24999 docs: update configuration section in README.md to include scrape parameters and example usage link-navigation Arik Jones (aider) 2024-09-30 14:20:17 -05:00
  • 01465a08b7 fix: set default values for requests_per_second and burst_limit in configuration to prevent rate limiter errors Arik Jones (aider) 2024-09-30 14:19:00 -05:00
  • e3355269b8 refactor: remove redundant scraping functions and update runWeb to utilize scraper.ScrapeSites for improved maintainability Arik Jones (aider) 2024-09-30 14:10:37 -05:00
  • 54c3776baf fix: update scrapeSingleURL calls to include visited map and currentDepth for thread safety and correct functionality Arik Jones (aider) 2024-09-30 14:08:16 -05:00
  • ee1561c502 feat: add LinksContainerSelector to SiteConfig and enhance scraping logic with depth control and link extraction Arik Jones (aider) 2024-09-30 14:05:10 -05:00
  • 5e8a257ff8 feat: implement links container selector for targeted scraping of linked content Arik Jones (aider) 2024-09-30 14:04:41 -05:00
  • 333b9a366c fix: Resolve playwright function deprecations and io/ioutil function deprecations. v0.0.5 Arik Jones 2024-09-24 15:13:36 -05:00
  • 1869dae89a docs: update configuration section in README.md v0.0.4 Arik Jones (aider) 2024-09-22 18:36:17 -05:00
  • d3ff7cb862 docs: Update README.md CLI flag documentation Arik Jones (aider) 2024-09-22 18:33:24 -05:00
  • ea410e4abb feat: Update README.md to reflect recent changes in functionality Arik Jones (aider) 2024-09-22 18:31:06 -05:00
  • 7d8e25b1ad docs: Add CHANGELOG.md with v0.0.3 release notes Arik Jones (aider) 2024-09-22 18:20:25 -05:00
  • 691832e282 fix: Update expectation v0.0.3 Arik Jones 2024-09-22 18:18:03 -05:00
  • 31e0fa5ea4 fix: Remove redeclaration of cfg variable in cmd/root.go v0.0.2 Arik Jones (aider) 2024-09-22 17:07:57 -05:00
  • 71f63ddaa8 fix: resolve undefined config variable in cmd/files.go Arik Jones (aider) 2024-09-22 17:07:32 -05:00
  • 574800c241 fix: Update runRollup function to accept config parameter Arik Jones (aider) 2024-09-22 17:06:18 -05:00
  • d5a94f5468 fix: remove indentation while preserving HTML structure in ExtractContentWithCSS Arik Jones (aider) 2024-09-22 17:00:16 -05:00
  • 59994c085c fix: improve file ignore logic and preserve newlines in extracted content Arik Jones (aider) 2024-09-22 16:58:53 -05:00
  • 396f092d50 fix: improve file ignore pattern matching for nested directories Arik Jones (aider) 2024-09-22 16:58:22 -05:00
  • 274ef7ea79 test: enhance and expand test coverage for file operations Arik Jones (aider) 2024-09-22 16:56:52 -05:00
  • a55e8df02a refactor: improve error handling and variable naming in TestRunRollup Arik Jones 2024-09-22 16:56:51 -05:00
  • 364b185269 fix: resolve test failures in TestRunRollup, TestExtractContentWithCSS, and TestExtractLinks Arik Jones (aider) 2024-09-21 16:04:20 -05:00
  • 952c2dda02 refactor: update browser initialization in scraper tests Arik Jones (aider) 2024-09-21 16:01:51 -05:00
  • de84d68b4c test: initialize browser before running ExtractLinks test Arik Jones (aider) 2024-09-21 16:01:08 -05:00
  • e5d4c514a7 fix: resolve build errors in test files Arik Jones (aider) 2024-09-21 15:59:39 -05:00
  • 6ff44f81bb fix: resolve nil pointer dereference in ExtractContentWithCSS test Arik Jones (aider) 2024-09-21 15:59:08 -05:00
  • 2fd411ce65 test: add debugging info and fix reflect import Arik Jones (aider) 2024-09-21 15:57:05 -05:00
  • 73116e8d82 Fix logging and other issues from preventing scraping Arik Jones 2024-09-21 15:54:33 -05:00
  • beca098b0d cleanup: go module changes fix-logging Arik Jones 2024-09-21 15:39:14 -05:00
  • 96ac2dbfc0 fix: set default rate limiter values to allow scraping Arik Jones (aider) 2024-09-21 15:30:58 -05:00
  • 6bc76ff9da fix: resolve undefined log errors in web.go Arik Jones (aider) 2024-09-21 15:28:02 -05:00
  • 41268853ba fix: add missing os package import in scraper.go Arik Jones (aider) 2024-09-21 15:27:00 -05:00
  • a3f3f6e560 fix: update logger to use stdout for verbose output Arik Jones (aider) 2024-09-21 15:26:21 -05:00
  • 08e0e29463 feat: add logger setup in web command Arik Jones (aider) 2024-09-21 10:58:14 -05:00
  • 8f824d8990 feat: enhance logging in runWeb function for better debugging Arik Jones (aider) 2024-09-21 10:57:51 -05:00
  • 6f39dd6726 feat: enhance logging in scraper for better debugging Arik Jones (aider) 2024-09-21 10:57:10 -05:00
  • 67c62456fa feat: enhance logging for scraping process Arik Jones (aider) 2024-09-21 10:56:26 -05:00
  • 751ea5828d refactor: update ScrapeSites to handle base_url and allowed_paths Arik Jones (aider) 2024-09-21 10:55:01 -05:00
  • 5482621d99 fix: Use preferred fmt.Fprintf function v0.0.1 Arik Jones 2024-09-20 13:48:28 -05:00
  • 3788a08b00 fix: Remove unused args in getDefaultFilename(), use preferred fmt.Fprintf function Arik Jones 2024-09-20 13:47:52 -05:00
  • 8ba54001ce cleanup: Ran go mod tidy to clear out an unused dep. Arik Jones 2024-09-20 13:41:51 -05:00
  • 7e4f4cdbb6 fix: update writeMultipleFiles function to handle multiple files Arik Jones (aider) 2024-09-19 16:37:07 -05:00
  • f1f0bd3895 feat: add MaxDepth to URL-based configuration and use outputType directly Arik Jones (aider) 2024-09-19 16:36:24 -05:00
  • 57bbc5a1ac refactor: update writeMultipleFiles to create single output file Arik Jones (aider) 2024-09-19 16:35:46 -05:00
  • 32499abbc0 fix: improve URL parsing and title extraction in getFilenameFromContent Arik Jones (aider) 2024-09-19 16:33:55 -05:00
  • 237ed512fc fix: handle error from getFilenameFromContent in writeMultipleFiles Arik Jones (aider) 2024-09-19 16:32:30 -05:00
  • 7c8fcc3261 fix: update getFilenameFromContent to handle invalid URLs and use .rollup.md suffix Arik Jones (aider) 2024-09-19 16:31:43 -05:00
  • 30e11153f9 refactor: update getFilenameFromContent to remove http from filenames Arik Jones (aider) 2024-09-19 16:29:55 -05:00
  • 0219881f61 fix: remove unused import in cmd/web_test.go Arik Jones (aider) 2024-09-19 16:26:38 -05:00
  • c77ae918c5 refactor: remove redundant variable declarations in test file Arik Jones (aider) 2024-09-19 16:25:30 -05:00
  • 1b696ce9c6 refactor: use wrapper functions for easier testing Arik Jones (aider) 2024-09-19 16:25:02 -05:00
  • df1178cb03 test: refactor TestScrapeURL to use local mock functions Arik Jones (aider) 2024-09-19 16:24:23 -05:00
  • c4831dfea2 fix: resolve compilation errors in web_test.go Arik Jones (aider) 2024-09-19 16:23:40 -05:00
  • 3c22d8034d fix: correct import path and update Config struct usage in test Arik Jones (aider) 2024-09-19 16:23:01 -05:00
  • c7791814c9 fix: add missing imports and correct Config reference in files_test.go Arik Jones (aider) 2024-09-19 16:21:11 -05:00
  • e184cef444 test: add unit tests for cmd and internal packages Arik Jones (aider) 2024-09-19 16:15:32 -05:00
  • 702665bb2e fix: import config package to resolve undefined error Arik Jones (aider) 2024-09-19 16:12:30 -05:00
  • 1d02cab585 fix: resolve type mismatch for PathOverrides in SiteConfig Arik Jones (aider) 2024-09-19 16:11:14 -05:00
  • e3fddf101c fix: resolve undefined types and import issues in scraper.go Arik Jones (aider) 2024-09-19 16:10:06 -05:00
  • 569ff9924d feat: implement site-based scraping with path overrides Arik Jones (aider) 2024-09-19 16:06:55 -05:00
  • 1d38e4157c fix: add Scrape field to Config struct and create ScrapeConfig Arik Jones (aider) 2024-09-19 15:23:35 -05:00
  • d44fabf783 feat: implement rate limiting for URL scraping Arik Jones (aider) 2024-09-19 15:22:02 -05:00
  • f9eee282bc docs: Update readme to include generate command that produces default config file. Arik Jones 2024-09-19 12:22:42 -05:00
  • fca1422104 refactor: improve generate command and use config package Arik Jones (aider) 2024-09-19 12:08:34 -05:00
  • 2e563836f3 feat: add generate subcommand for creating rollup.yml config Arik Jones (aider) 2024-09-19 12:08:09 -05:00
  • 160a15dbb1 fix: Use logger instead of log. Move web subcommand initialization to root.go Arik Jones 2024-09-19 11:44:27 -05:00
  • eabf1ba23f feat: add files subcommand and refactor rollup functionality Arik Jones (aider) 2024-09-19 11:38:09 -05:00
  • 1e88fae75d docs: Update the readme Arik Jones 2024-09-19 11:08:13 -05:00
  • eba453f09e fix: rollup output file name (again) Arik Jones 2024-09-19 11:02:35 -05:00
  • d3ba28d03b fix: Output markdown files should end in *.rollup.md Arik Jones 2024-09-19 10:56:00 -05:00
  • 197f3affc7 fix: Don't use PersistentPreRunE. Caused the actual runRollup function to never run. Arik Jones 2024-09-19 10:43:23 -05:00
  • 7f468a05bd feat: install only Chromium browser Arik Jones (aider) 2024-09-17 14:51:09 -05:00
  • 4586b5daaa fix: Install Playwright and browsers before initializing Arik Jones (aider) 2024-09-17 14:48:15 -05:00
  • 056c3e368e fix: Update import and usage of Config type in cmd/root.go Arik Jones (aider) 2024-09-16 09:53:44 -05:00
  • 21d3e8ee68 fix: Handle missing configuration file for help command Arik Jones (aider) 2024-09-16 09:52:48 -05:00
  • efee186ae0 fix: Skip config loading and rollup execution for help command Arik Jones (aider) 2024-09-16 09:52:25 -05:00
  • 41fb9e3fad Correction in web scraping example. Arik Jones 2024-09-14 21:38:17 -05:00
  • 6cb2f03d74 feat: Add web scraping functionality and exclusionary CSS paths Arik Jones (aider) 2024-09-14 21:26:59 -05:00
  • bb12e3d029 fix: Something in root Arik Jones 2024-09-14 21:25:50 -05:00
  • 53dcd6eb71 feat: Add support for exclusionary CSS paths in config.go Arik Jones (aider) 2024-09-14 20:59:08 -05:00