tnypxl
9495ddd788
fix: resolve nil logger panic and CLI URL processing (#5)
- Initialize logger before Playwright to prevent nil pointer dereference
- Set AllowedPaths for CLI URLs so they get processed by scraper
Co-authored-by: Claude <noreply@anthropic.com>
2025-11-27 11:04:08 -06:00
tnypxl
eb3b611864
Merge branch 'claude/fix-bugs-and-gaps-01DvJSzruQh49DU6XK5AykQU' (#4)
2025-11-27 10:50:03 -06:00
tnypxl
877a7876c0
fix: resolve 5 bugs identified in code review (#3)
2025-11-27 09:58:09 -06:00
Arik Jones
9341a51d09
fix multi-file output
2024-12-06 17:02:31 -06:00
Arik Jones
9e9ac903e4
remove maxdepth from tests
2024-12-06 15:19:12 -06:00
tnypxl
02e39baf38
flatten scrape config to 'sites:'
* flatten scrape config to 'sites:'. Update unit tests and readme.
* remove check for file_extensions configuration.
* show progress indication after 5 seconds.
* add documentation to functions
* fix: remove MaxDepth and link extraction functionality
* fix: Remove MaxDepth references from cmd/web.go
2024-10-14 16:09:58 -05:00
333b9a366c
fix: Resolve playwright function deprecations and io/ioutil function deprecations.
2024-09-24 15:13:36 -05:00
Arik Jones
691832e282
fix: Update expectation
2024-09-22 18:18:03 -05:00
Arik Jones (aider)
31e0fa5ea4
fix: Remove redeclaration of cfg variable in cmd/root.go
2024-09-22 17:07:57 -05:00
Arik Jones (aider)
71f63ddaa8
fix: resolve undefined config variable in cmd/files.go
2024-09-22 17:07:32 -05:00
Arik Jones (aider)
574800c241
fix: Update runRollup function to accept config parameter
2024-09-22 17:06:18 -05:00
Arik Jones (aider)
59994c085c
fix: improve file ignore logic and preserve newlines in extracted content
2024-09-22 16:58:53 -05:00
Arik Jones (aider)
396f092d50
fix: improve file ignore pattern matching for nested directories
2024-09-22 16:58:22 -05:00
Arik Jones (aider)
274ef7ea79
test: enhance and expand test coverage for file operations
2024-09-22 16:56:52 -05:00
Arik Jones
a55e8df02a
refactor: improve error handling and variable naming in TestRunRollup
2024-09-22 16:56:51 -05:00
Arik Jones (aider)
364b185269
fix: resolve test failures in TestRunRollup, TestExtractContentWithCSS, and TestExtractLinks
2024-09-21 16:04:20 -05:00
Arik Jones (aider)
e5d4c514a7
fix: resolve build errors in test files
2024-09-21 15:59:39 -05:00
Arik Jones (aider)
6ff44f81bb
fix: resolve nil pointer dereference in ExtractContentWithCSS test
2024-09-21 15:59:08 -05:00
Arik Jones (aider)
2fd411ce65
test: add debugging info and fix reflect import
2024-09-21 15:57:05 -05:00
Arik Jones
73116e8d82
Fix logging and other issues that prevented scraping
2024-09-21 15:54:33 -05:00
5482621d99
fix: Use preferred fmt.Fprintf function
2024-09-20 13:48:28 -05:00
3788a08b00
fix: Remove unused args in getDefaultFilename(), use preferred fmt.Fprintf function
2024-09-20 13:47:52 -05:00
Arik Jones (aider)
fca1422104
refactor: improve generate command and use config package
2024-09-19 12:08:34 -05:00
Arik Jones (aider)
2e563836f3
feat: add generate subcommand for creating rollup.yml config
2024-09-19 12:08:09 -05:00
Arik Jones
160a15dbb1
fix: Use logger instead of log. Move web subcommand initialization to root.go
2024-09-19 11:44:27 -05:00
Arik Jones (aider)
eabf1ba23f
feat: add files subcommand and refactor rollup functionality
2024-09-19 11:38:09 -05:00
Arik Jones
eba453f09e
fix: rollup output file name (again)
2024-09-19 11:02:35 -05:00
Arik Jones
d3ba28d03b
fix: Output markdown files should end in *.rollup.md
2024-09-19 10:56:00 -05:00
Arik Jones
197f3affc7
fix: Don't use PersistentPreRunE. It caused the actual runRollup function to never run.
2024-09-19 10:43:23 -05:00
Arik Jones (aider)
056c3e368e
fix: Update import and usage of Config type in cmd/root.go
2024-09-16 09:53:44 -05:00
Arik Jones (aider)
21d3e8ee68
fix: Handle missing configuration file for help command
2024-09-16 09:52:48 -05:00
Arik Jones (aider)
efee186ae0
fix: Skip config loading and rollup execution for help command
2024-09-16 09:52:25 -05:00
Arik Jones
bb12e3d029
fix: Something in root
2024-09-14 21:25:50 -05:00
Arik Jones (aider)
53dcd6eb71
feat: Add support for exclusionary CSS paths in config.go
2024-09-14 20:59:08 -05:00
Arik Jones (aider)
ece9492b30
fix: Remove unused import in cmd/web.go
2024-09-14 20:56:51 -05:00
Arik Jones (aider)
c1755836b5
fix: Move HTML to Markdown conversion to scraper.go
2024-09-14 20:55:35 -05:00
Arik Jones
939cffb55e
fix: Simplify sanitizeFilename function
2024-09-14 20:55:34 -05:00
Arik Jones (aider)
b6de9d211b
fix: Merge duplicate runWeb function and add missing function definitions
2024-09-14 20:42:10 -05:00
Arik Jones (aider)
a6ebf0062a
fix: Add --verbose flag to web subcommand
2024-09-14 20:41:23 -05:00
Arik Jones (aider)
aaff602b3e
fix: Use local getFilenameFromContent function instead of undefined scraper.GetFilenameFromContent
2024-09-14 20:38:06 -05:00
Arik Jones (aider)
52c7de255d
feat: Implement scraping of multiple URLs with optional CSS locators and separate output files
2024-09-14 20:35:35 -05:00
Arik Jones (aider)
23508df6f4
feat: Add optional logging to the scraper
2024-09-14 19:59:02 -05:00
Arik Jones (aider)
f4c368e112
fix: Update web command to properly handle --exclude flag
2024-09-14 17:02:44 -05:00
Arik Jones (aider)
d80151b9eb
fix: reorder flag definitions in cmd/web.go
2024-09-14 17:01:49 -05:00
Arik Jones (aider)
9196708426
fix: Update web command flags
2024-09-14 17:01:17 -05:00
Arik Jones (aider)
7cdd68d020
feat: Separate include and exclude selectors in web scraper
2024-09-14 16:59:59 -05:00
Arik Jones (aider)
d66fd04016
fix: Use - instead of ! to filter unwanted elements
2024-09-14 16:53:42 -05:00
Arik Jones (aider)
e50484a6fa
fix: Remove XPath-related code from cmd/web.go
2024-09-14 16:51:54 -05:00
Arik Jones (aider)
d0ee666b07
refactor: Modify scraper to capture only the main content
2024-09-14 15:20:15 -05:00
Arik Jones (aider)
bfd70fd786
fix: Add import for scraper package in cmd/root.go
2024-09-14 15:17:18 -05:00