Commit Graph

114 Commits

Author SHA1 Message Date
Claude
ff13012408 fix: address functionality gaps identified in code review
- Wire up --config/-f flag to actually load custom config files
  - Move config loading to PersistentPreRunE in root.go
  - Simplify main.go to just call cmd.Execute()
  - Move Playwright init to web command's PreRunE/PostRunE

- Remove unused functions from cmd/web.go (~90 lines of dead code)
  - Remove writeSingleFile, writeMultipleFiles, generateDefaultFilename
  - Remove scrapeURL, extractAndConvertContent, testExtractAndConvertContent
  - Remove unused mock function from web_test.go

- Add OutputType validation to Config.Validate()
  - Only allow "single", "separate", or empty string
  - Add test cases for valid and invalid output types
2025-11-27 16:05:42 +00:00
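The OutputType validation described in ff13012408 can be pictured with a minimal sketch. This is not the repository's actual code: the `Config` struct shape, the `ErrInvalidOutputType` name, and the error wording are assumptions made for illustration. Only the accepted values ("single", "separate", or empty string) come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical stand-in for the project's configuration type;
// only the field relevant to this commit is shown.
type Config struct {
	OutputType string
}

// ErrInvalidOutputType is an illustrative sentinel error (name assumed).
var ErrInvalidOutputType = errors.New("invalid output type")

// Validate accepts "single", "separate", or the empty string and rejects
// every other value, matching the rule stated in the commit message.
func (c Config) Validate() error {
	switch c.OutputType {
	case "", "single", "separate":
		return nil
	default:
		return fmt.Errorf("%w: %q (want \"single\", \"separate\", or empty)",
			ErrInvalidOutputType, c.OutputType)
	}
}

func main() {
	// Exercise the three accepted values and one rejected value.
	for _, v := range []string{"", "single", "separate", "combined"} {
		fmt.Printf("%q -> %v\n", v, Config{OutputType: v}.Validate())
	}
}
```

Wrapping a sentinel error with `%w` would let callers and tests check the failure with `errors.Is` instead of comparing message strings, though the real implementation may report errors differently.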
Claude
09608cf073 fix: resolve 5 bugs identified in code review
- Fix malformed YAML in config_test.go (incorrect indentation)
- Add validation for empty file_extensions in Config.Validate()
- Remove obsolete max_depth test case (field no longer exists)
- Remove unused global cfg variable in main.go
- Fix race condition in ScrapeSites by counting URLs before goroutines
- Remove unreachable JavaScript code in scroll script, add proper delay
- Standardize file extensions to not include leading dot
2025-11-27 15:56:37 +00:00
tnypxl
7569aff6ec Add CLAUDE.md with project guidance for Claude Code (#2) 2025-11-27 09:29:10 -06:00
Arik Jones
9341a51d09 fix multi-file output 2024-12-06 17:02:31 -06:00
Arik Jones
9e9ac903e4 remove maxdepth from tests 2024-12-06 15:19:12 -06:00
Arik Jones
645626f763 remove maxdepth from tests 2024-12-06 15:17:33 -06:00
tnypxl
02e39baf38 flatten scrape config to 'sites:'
* flatten scrape config to 'sites:'. Update unit tests and readme.
* remove check for file_extensions configuration. 
* show progress indication after 5 seconds.
* add documentation to functions
* fix: remove MaxDepth and link extraction functionality
* fix: Remove MaxDepth references from cmd/web.go
v0.1.0
2024-10-14 16:09:58 -05:00
333b9a366c fix: Resolve playwright function deprecations and io/ioutil function deprecations. v0.0.5 2024-09-24 15:13:36 -05:00
Arik Jones (aider)
1869dae89a docs: update configuration section in README.md v0.0.4 2024-09-22 18:36:17 -05:00
Arik Jones (aider)
d3ff7cb862 docs: Update README.md CLI flag documentation 2024-09-22 18:33:24 -05:00
Arik Jones (aider)
ea410e4abb feat: Update README.md to reflect recent changes in functionality 2024-09-22 18:31:06 -05:00
Arik Jones (aider)
7d8e25b1ad docs: Add CHANGELOG.md with v0.0.3 release notes 2024-09-22 18:20:25 -05:00
Arik Jones
691832e282 fix: Update expectation v0.0.3 2024-09-22 18:18:03 -05:00
Arik Jones (aider)
31e0fa5ea4 fix: Remove redeclaration of cfg variable in cmd/root.go v0.0.2 2024-09-22 17:07:57 -05:00
Arik Jones (aider)
71f63ddaa8 fix: resolve undefined config variable in cmd/files.go 2024-09-22 17:07:32 -05:00
Arik Jones (aider)
574800c241 fix: Update runRollup function to accept config parameter 2024-09-22 17:06:18 -05:00
Arik Jones (aider)
d5a94f5468 fix: remove indentation while preserving HTML structure in ExtractContentWithCSS 2024-09-22 17:00:16 -05:00
Arik Jones (aider)
59994c085c fix: improve file ignore logic and preserve newlines in extracted content 2024-09-22 16:58:53 -05:00
Arik Jones (aider)
396f092d50 fix: improve file ignore pattern matching for nested directories 2024-09-22 16:58:22 -05:00
Arik Jones (aider)
274ef7ea79 test: enhance and expand test coverage for file operations 2024-09-22 16:56:52 -05:00
Arik Jones
a55e8df02a refactor: improve error handling and variable naming in TestRunRollup 2024-09-22 16:56:51 -05:00
Arik Jones (aider)
364b185269 fix: resolve test failures in TestRunRollup, TestExtractContentWithCSS, and TestExtractLinks 2024-09-21 16:04:20 -05:00
Arik Jones (aider)
952c2dda02 refactor: update browser initialization in scraper tests 2024-09-21 16:01:51 -05:00
Arik Jones (aider)
de84d68b4c test: initialize browser before running ExtractLinks test 2024-09-21 16:01:08 -05:00
Arik Jones (aider)
e5d4c514a7 fix: resolve build errors in test files 2024-09-21 15:59:39 -05:00
Arik Jones (aider)
6ff44f81bb fix: resolve nil pointer dereference in ExtractContentWithCSS test 2024-09-21 15:59:08 -05:00
Arik Jones (aider)
2fd411ce65 test: add debugging info and fix reflect import 2024-09-21 15:57:05 -05:00
Arik Jones
73116e8d82 Fix logging and other issues from preventing scraping 2024-09-21 15:54:33 -05:00
5482621d99 fix: Use preferred fmt.Fprintf function v0.0.1 2024-09-20 13:48:28 -05:00
3788a08b00 fix: Remove unused args in getDefaultFilename(), use preferred fmt.Fprintf function 2024-09-20 13:47:52 -05:00
8ba54001ce cleanup: Ran go mod tidy to clear out an unused dep. 2024-09-20 13:41:51 -05:00
Arik Jones
f9eee282bc docs: Update readme to include generate command that produces default config file. 2024-09-19 12:22:42 -05:00
Arik Jones (aider)
fca1422104 refactor: improve generate command and use config package 2024-09-19 12:08:34 -05:00
Arik Jones (aider)
2e563836f3 feat: add generate subcommand for creating rollup.yml config 2024-09-19 12:08:09 -05:00
Arik Jones
160a15dbb1 fix: Use logger instead of log. Move web subcommand initialization to root.go 2024-09-19 11:44:27 -05:00
Arik Jones (aider)
eabf1ba23f feat: add files subcommand and refactor rollup functionality 2024-09-19 11:38:09 -05:00
Arik Jones
1e88fae75d docs: Update the readme 2024-09-19 11:08:13 -05:00
Arik Jones
eba453f09e fix: rollup output file name (again) 2024-09-19 11:02:35 -05:00
Arik Jones
d3ba28d03b fix: Output markdown files should end in *.rollup.md 2024-09-19 10:56:00 -05:00
Arik Jones
197f3affc7 fix: Don't use PersistentPreRunE. It caused the actual runRollup function to never run. 2024-09-19 10:43:23 -05:00
Arik Jones (aider)
7f468a05bd feat: install only Chromium browser 2024-09-17 14:51:09 -05:00
Arik Jones (aider)
4586b5daaa fix: Install Playwright and browsers before initializing 2024-09-17 14:48:15 -05:00
Arik Jones (aider)
056c3e368e fix: Update import and usage of Config type in cmd/root.go 2024-09-16 09:53:44 -05:00
Arik Jones (aider)
21d3e8ee68 fix: Handle missing configuration file for help command 2024-09-16 09:52:48 -05:00
Arik Jones (aider)
efee186ae0 fix: Skip config loading and rollup execution for help command 2024-09-16 09:52:25 -05:00
Arik Jones
41fb9e3fad Correction in web scraping example. 2024-09-14 21:38:17 -05:00
Arik Jones (aider)
6cb2f03d74 feat: Add web scraping functionality and exclusionary CSS paths 2024-09-14 21:26:59 -05:00
Arik Jones
bb12e3d029 fix: Something in root 2024-09-14 21:25:50 -05:00
Arik Jones (aider)
53dcd6eb71 feat: Add support for exclusionary CSS paths in config.go 2024-09-14 20:59:08 -05:00
Arik Jones (aider)
ece9492b30 fix: Remove unused import in cmd/web.go 2024-09-14 20:56:51 -05:00